Gene for increased somatic recombination
TECHNICAL FIELD 

The present invention relates to DNA that encodes proteins that control somatic 
recombination, in particular in plants. 

BACKGROUND 

Cells of all organisms have evolved a series of DNA repair pathways that counteract the 
deleterious effects of DNA damage and are triggered by intricate signal cascades. 
Homologous recombination in plants stabilizes the genome by repairing damaged
chromosomes while simultaneously generating genetic variability through the creation of new
genes and new genetic linkages. Repair of DNA damage by recombination is particularly 
significant for cells under exogenous and endogenous genotoxic stress because of its 
potential to remove a wide range of DNA lesions. The current understanding of genetic and 
molecular components underlying meiotic and somatic recombination and DNA repair in 
plants is limited. To be able to modify or improve DNA repair using gene technology it is 
necessary to identify key proteins involved in said pathways or cascades. 

The precise manipulation of the genome of higher plants still is a major challenge for plant 
genetic engineering. Some advances have been made recently for the creation of point 
mutations at predetermined positions by chimeric RNA/DNA oligonucleotides (Beetham et al.
1999, Hohn & Puchta 1999, Zhu et al. 1999, Kipp et al. 2000, Zhu et al. 2000). However, the
targeted insertion of longer stretches of DNA sequence at any desired location ("knock-in")
or the replacement of predetermined plant genomic sequences by heterologous DNA
("knock-out") via homologous recombination is at present not possible as a routine technique
(Mengiste & Paszkowski 1999, Puchta 2002). 

Few reports have appeared in the literature that describe successful "gene targeting" in 
higher plants (Paszkowski et al. 1988, Lee et al. 1990, Offringa et al. 1990, Miao & Lam 
1995, Kempin et al. 1997, Hanin et al. 2001), but the reported absolute numbers and relative
frequencies of the desired events were very low. Indeed, the main problem for "gene
targeting" experiments is the low frequency of the desired homologous recombination events
- 10^-3 to 10^-5 (Hohn & Puchta 1999, Mengiste & Paszkowski 1999) - relative to illegitimate
recombination/integration events. 

Various attempts to increase the low relative frequency of targeted homologous
recombination events, by improved selection schemes ("positive-negative selection") or by
providing extended regions of sequence homology, were not successful (Thykjaer et al.
1997, Gallego et al. 1999). One promising strategy to facilitate gene targeting in higher
plants would be to shift the balance between illegitimate and homologous recombination
events towards the latter, by promoting homologous recombination events in plants through
genetic manipulation (Gherbi et al. 2001).

One approach described in the literature is the expression in plants of heterologous proteins 
known to be involved in homologous recombination. Overproduction of the bacterial 
resolvase RuvC was shown to increase somatic inter- and intrachromosomal recombination,
as well as extrachromosomal recombination (Shalev et al. 1999), but no gene targeting
studies have yet been reported with this system. Expression of the bacterial RecA protein had
similar effects (Reiss et al. 1996, Reiss et al. 1997), but subsequent experiments did not
show an increase of gene targeting events (Reiss et al. 2000). So far, it is not clear whether
heterologous proteins can successfully interact with the plant recombination machinery to 
affect the outcome of the recombination events required for gene targeting. In addition, 
these foreign proteins might have undesired side effects in plants. 

An alternative approach is to rely on endogenous plant genes to influence the frequency of 
homologous recombination events. So far, indirect approaches have been reported to isolate 
plant genes involved in recombination. The cloning of plant orthologs to recombination and 
repair genes from other species was reported (Klimyuk & Jones 1997, Doutriaux et al. 1998,
Hartung & Puchta 1999, Gallego et al. 2000, Lin et al. 2000), but so far the importance of
these genes for recombination in plants has not been evaluated. Functional screens have
been carried out to identify plant mutants hypersensitive to genotoxic treatments (Davies et
al. 1994, Jenkins et al. 1995, Jiang et al. 1997, Masson et al. 1997, Albinsky et al. 1999,
Mengiste et al. 1999). Since recombination is an important mechanism for DNA repair, some 
of these mutants might be affected in their recombination behavior. This was experimentally 
demonstrated for some X-ray hypersensitive Arabidopsis mutants that also showed reduced
levels of somatic recombination (Masson & Paszkowski 1997), although the affected gene
has not been isolated. Recently, a DNA damage hypersensitive Arabidopsis mutant was
isolated from a T-DNA tagged population, and the affected gene (MIM) was cloned and shown to
encode an SMC (Structural Maintenance of Chromosomes) protein. Since the mim mutant
showed decreased frequencies of somatic recombination, MIM seems to be involved in some
aspect of somatic recombination (Mengiste et al. 1999). Also in tobacco a
hyperrecombinogenic mutant was isolated (Gorbunova et al. 2000). However, the gene 
affected could not be isolated so far. 

Previously, a genetic system was described to study somatic homologous recombination 
between repeated sequences in whole plants (Swoboda et al. 1994, Puchta et al. 1995a, 
Puchta et al. 1995b). Briefly, a transgene carrying two non-functional halves of the
beta-glucuronidase reporter gene sharing a stretch of sequence identity serves as a reporter
construct. Homologous recombination between the repeated sequences results in the 
restoration of a functional reporter gene. Such events were detected by a sensitive 
histochemical assay, and confirmed by Southern blotting. This assay is destructive, since the 
staining procedure is lethal, so that direct isolation of mutants is difficult. 

Therefore, there is a need in the art to identify genes that increase somatic recombination,
and this invention meets that need.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 depicts sequences related to mbm17.5: A. predicted cDNA of mbm17.5; B. predicted
protein sequence of MBM17.5; C. full-length cDNA of mbm17.5; D. protein sequence of
MBM17.5; E. over-expressed transcript of mbm17.5 in mutant hw17

Figure 2 depicts sequences related to mbm17.6: A. predicted cDNA of mbm17.6 (DNA
polymerase III); B. predicted protein sequence of MBM17.6 (DNA polymerase III)

Figure 3 depicts the partial sequence of osmbm17.5, EST clone RICS1367A, Oryza sativa
homolog of mbm17.5

Figure 4 depicts the partial sequence of zmmbm17.5, EST clone 603011H11, Zea mays
homolog of mbm17.5

Figure 5 depicts the AtIno80 sequence and related sequences: A. AtIno80 coding sequence; B.
AtIno80-derived protein sequence; C. alignment of the AtIno80 sequence and the public sequence
At3g57300, showing a splicing difference ("Query" refers to the AtIno80 sequence; "Sbjct" to the
public database sequence, gi|18410689|ref|NM_115590.1| (AGI: At3g57300))

Figure 6 depicts the nucleotide sequences of AtRvb1 (At5g22330)

Figure 7 depicts the nucleotide sequences of AtRvb21 (At5g67630)

Figure 8 depicts the nucleotide sequences of AtRvb22 (At3g49830)

Figure 9 depicts the nucleotide sequences of At3g57290

Figure 10 depicts the alignment of protein sequences from MBM17.5, zmMBM17.5 and
osMBM17.5; helicase motifs are marked as squares

SUMMARY OF THE INVENTION 

The present invention provides an isolated nucleic acid, in particular DNA, comprising a 
sequence having 98.5% or more identity with the sequences depicted in Figure 1C, Figure
1E or Figure 5A. Also provided are vectors and host cells comprising the nucleic acids of the
invention, as well as polypeptides encoded by the nucleic acids. 

In a further aspect of the invention, a method for inducing homologous recombination in a 
cell is provided, comprising modulating the expression or properties of one or more gene 
products selected from the group consisting of MBM17.5, MBM17.6, osMBM17.5,
zmMBM17.5, AtIno80, At3g57300, Rvb1 (At5g22330), Rvb21 (At5g67630), Rvb22
(At3g49830) and At3g57290, their homologues, fragments or derivatives. In one
embodiment, modulation is achieved by increasing expression of the gene product, such as
by introducing a nucleic acid encoding the gene product into the cell operably linked to a
promoter, and allowing transcription and translation of the gene in an amount sufficient to
affect homologous recombination in said cell.

The method can be used to increase somatic homologous recombination and/or meiotic 
homologous recombination. The promoter can be an inducible promoter, a tissue-specific 
promoter, a constitutive promoter or a meiosis-specific promoter, depending on the desired 
effect. 

Also provided is a method of increasing gene targeting to a desired locus in a host cell
comprising introducing a desired gene into a host cell, modulating the expression or
properties of one or more gene products selected from the group consisting of MBM17.5,
MBM17.6, osMBM17.5, zmMBM17.5, AtIno80, At3g57300, Rvb1 (At5g22330), Rvb21
(At5g67630), Rvb22 (At3g49830) and At3g57290, or functional fragments, derivatives and
homologues thereof in the host cell, and detecting integration of the desired gene at a 
selected locus in the genome of the host cell. 

DETAILED DESCRIPTION OF THE INVENTION 

The present inventors have used a direct screening approach to identify mutants of 
Arabidopsis thaliana showing increased frequencies of somatic recombination, by visualizing 
recombination events in living plants from a mutagenized population and directly isolating 
plants with the desired phenotype. The description below describes a genetic screen and 
two Arabidopsis mutants, hw17 and sm22, derived from it, and the associated plant genes
responsible for the altered recombination phenotype. 

Existing technologies for gene targeting in plants are very inefficient. The modulation of the 
expression or properties of one or more gene products selected from the group consisting of 
MBM17.5, MBM17.6, osMBM17.5, zmMBM17.5, AtIno80, At3g57300, Rvb1 (At5g22330),
Rvb2 (1 and 2; also referred to herein as Rvb21 or At5g67630, and Rvb22 or At3g49830,
respectively) and At3g57290 increases the efficiency of gene targeting events and facilitates
the routine manipulation of the genome of higher plants by homologous recombination. For
the purposes of this disclosure, to avoid repetition, reference to the above group of gene 
products is meant to include reference to each gene individually, i.e., the modulation of the 
expression or properties of MBM17.5, the modulation of the expression or properties of 
MBM17.6, and so on. 

An in vivo screen for Arabidopsis mutants has been devised to allow direct detection of 
mutants with increased recombination. As a result of the screen, mutant plants with a
more than 10-fold increased or altered frequency of somatic recombination events are
provided, as well as the plant genes MBM17.5, MBM17.6, osMBM17.5, zmMBM17.5,
AtIno80, At3g57300, Rvb1 (At5g22330), Rvb21 (At5g67630), Rvb22 (At3g49830) and
At3g57290 affected in these mutant plants, and orthologs from other plant species. The
screen allows the identification of mutant plants and plant genes with a strong effect on
recombination having little or no undesired side effects on the plant. An increase in
homologous recombination frequency is useful to achieve an increased efficiency of gene 
targeting in plants. 

Within the context of the present invention reference to a gene is to be understood as 
reference to a DNA coding sequence associated with regulatory sequences, which allow 
transcription of the coding sequence into RNA such as mRNA, rRNA, tRNA, snRNA, sense 
RNA or antisense RNA. Examples of regulatory sequences are promoter sequences, 5' and 
3' untranslated sequences, introns, and termination sequences. 

A promoter is understood to be a DNA sequence initiating transcription of an associated 
DNA sequence, and may also include elements that act as regulators of gene expression 
such as activators, enhancers, or repressors. 

Expression of a gene refers to its transcription into RNA or its transcription and subsequent 
translation into protein within a living cell. In the case of antisense constructs expression 
refers to the transcription of the antisense DNA only. 

The term transformation of cells designates the introduction of nucleic acid into a host cell,
particularly the stable integration of a DNA molecule into the genome of said cell.

Any part or piece of a specific nucleotide or amino acid sequence is referred to as a
component sequence or fragment.

In one aspect of the invention, nucleic acids and polypeptides are provided that can 
modulate homologous recombination. A nucleic acid according to the present invention 
comprises a sequence having 98.5%, 99%, 99.5% or more identity with the sequences
depicted in Figure 1C, Figure 1E or Figure 5A. The DNA sequence in Figure 1A is 99.8%
identical to that in Figure 1C; the difference is due to different splicing. The nucleic acid can
be DNA or RNA, such as mRNA, rRNA, tRNA, snRNA, sense RNA or antisense RNA. Also provided is a
vector comprising the nucleic acid of the invention, as well as host cells comprising the
vector or nucleic acid of the invention. Suitable vectors and host cells are described in more
detail below. Also provided are polypeptides encoded by the nucleic acids of the invention.

In a further aspect of the invention, methods for increasing homologous recombination are 
provided by modulating the expression or properties of one or more gene products selected 
from the group consisting of MBM17.5, MBM17.6, osMBM17.5, zmMBM17.5, AtIno80,
At3g57300, Rvb1 (At5g22330), Rvb21 (At5g67630), Rvb22 (At3g49830) and At3g57290. In
order to increase homologous recombination several methods are useful depending on the 
gene and the gene targeting technique employed. Typically, modulation will mean increasing 
the activity of the gene product, which can easily be achieved by methods known in the art. 

In one embodiment, the desired gene is overexpressed in a host cell in an amount sufficient 
to increase homologous recombination in the host cell. By "overexpression", it is meant
increasing the amount of desired gene product in a host cell, compared to untreated cells. A
simple way to achieve overexpression is to produce transgenic host cells, in particular
transgenic plants, carrying a construct (vector) that ectopically overexpresses the sequence
of interest under the control of a suitable promoter, such as the 35S CaMV, MAS 
(mannopine synthase) or ubiquitin promoter. 

In another embodiment, an inducible promoter is used to allow an increase in homologous
recombination frequency at the time and place needed, for example, for gene targeting.

Alternatively, the construct increasing recombination can be provided at the same time as
the targeting construct by co-transformation; the effect is then achieved by the transient
expression of the construct containing the said genes.

It will be apparent to one of ordinary skill in the art that functional fragments, homologues or
derivatives of the desired gene can be used. Dynamic programming algorithms yield
different kinds of alignments. In general there exist two approaches towards sequence
alignment. Algorithms as proposed by Needleman & Wunsch and by Sellers align the entire
length of two sequences, providing a global alignment of the sequences. The Smith-Waterman
algorithm, on the other hand, yields local alignments. A local alignment aligns the
pair of regions within the sequences that are most similar given the choice of scoring matrix
and gap penalties. This allows a database search to focus on the most highly conserved
regions of the sequences; it also allows similar domains within sequences to be identified.
To speed up alignments using the Smith-Waterman algorithm, both BLAST (Basic Local
Alignment Search Tool) and FASTA place additional restrictions on the alignments.
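
The difference between the two approaches can be illustrated with a minimal sketch. The scoring values (match, mismatch and a linear gap penalty) and the function names are assumptions chosen for illustration only; they are not the parameters used by BLAST or FASTA, and the functions return the optimal score without reconstructing the alignment itself.

```python
# Illustrative scoring values (assumptions, not taken from the description).
MATCH, MISMATCH, GAP = 1, -1, -2


def global_score(a, b):
    """Needleman-Wunsch: best score when all of a is aligned against all of b."""
    prev = [j * GAP for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * GAP]
        for j in range(1, len(b) + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            cur.append(max(prev[j - 1] + sub, prev[j] + GAP, cur[j - 1] + GAP))
        prev = cur
    return prev[-1]


def local_score(a, b):
    """Smith-Waterman: best score over any pair of regions within a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            # A cell never drops below zero, so an alignment may start anywhere.
            cell = max(0, prev[j - 1] + sub, prev[j] + GAP, cur[j - 1] + GAP)
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best
```

For two sequences that share only a short conserved region, the local score reflects that region alone, whereas the global score is lowered by the dissimilar flanking residues.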

Within the context of the present invention alignments are conveniently performed using 
BLAST, a set of similarity search programs designed to explore all of the available sequence 
databases regardless of whether the query is protein or DNA. Version BLAST 2.0 (Gapped 
BLAST) of this search tool has been made publicly available on the internet (currently 
http://www.ncbi.nlm.nih.gov/BLAST/). It uses a heuristic algorithm which seeks local as 
opposed to global alignments and is therefore able to detect relationships among sequences
which share only isolated regions. The scores assigned in a BLAST search have a
well-defined statistical interpretation. Particularly useful within the scope of the present invention
are the blastp program allowing for the introduction of gaps in the local sequence alignments
and the PSI-BLAST program, both programs comparing an amino acid query sequence
against a protein sequence database, as well as a blastp variant program allowing local
alignment of two sequences only. Said programs are preferably run with optional parameters 
set to the default values. 

For example, GenBank database annotation of mbm17.5 predicted a gene with similarities to 
Rad26 nucleotide excision repair proteins. Comparison of the predicted protein-coding
segments against the GenPept/SwissProt protein database using the BLASTP program
revealed many similar protein sequences of known function of the SWI2/SNF2
helicase/ATPase protein family. A similarity search of the protein database revealed that the
central region of this predicted protein of 1043 amino acids has significant similarity to a
number of proteins involved in DNA binding, repair, recombination, and chromatin
remodeling. In particular, the human protein ERCC6 (Troelstra et al. 1992), involved in
Cockayne's syndrome, and its S. cerevisiae homologue RAD26 (van Gool et al. 1994) are
important for the repair of active genes; the proteins RAD54 and rhp54, from S. cerevisiae
and S. pombe (Emery et al. 1991, Muris et al. 1996), and their mammalian homologues
(Essers et al. 2000) are involved in DNA recombination and repair; and the yeast proteins
MOT1 (Davis et al. 1992) and SNF2 (Laurent et al. 1991, Richmond & Peterson 1996) are
known to affect the expression of numerous genes, most likely by ATP-dependent chromatin
remodeling. All these proteins share an extended protein sequence motif with the predicted
product of the MBM17.5 coding sequence, the so-called helicase/ATPase domain of the
SWI2/SNF2 protein family (Gorbalenya & Koonin 1993, Aravind et al. 1999, Muchardt &
Yaniv 1999, Travers 1999), and may be useful in increasing homologous recombination
frequency.

Sequence alignments using BLAST can also take into account whether the substitution of
one amino acid for another is likely to conserve the physical and chemical properties
necessary to maintain the structure and function of the protein or is more likely to disrupt
essential structural and functional features of a protein. Such sequence similarity is
quantified in terms of a percentage of "positive" amino acids, as compared to the percentage
of identical amino acids, and can help in assigning a protein to the correct protein family in
border-line cases. 
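
The distinction between identical and "positive" positions can be sketched as follows. The residue groups below are a toy assumption made for illustration; BLAST itself derives positives from a substitution matrix (positions whose substitution score is greater than zero), and the function name is hypothetical. Percentages are computed over all alignment columns, gaps included.

```python
# Toy groups of chemically similar residues (an assumption for illustration;
# BLAST counts "positives" using a substitution matrix such as BLOSUM62).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def identity_and_positives(aligned_a, aligned_b):
    """Return (percent identity, percent positives) for two aligned sequences."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    identical = positive = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # gap columns count toward the alignment length only
        if x == y:
            identical += 1
            positive += 1
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            positive += 1  # conservative substitution
    columns = len(aligned_a)
    return 100.0 * identical / columns, 100.0 * positive / columns
```

With the aligned pair "KLDE-A" and "RLDDGA", three of six columns are identical and two further columns are conservative substitutions, so the percentage of positives exceeds the percentage of identities.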

Specific examples of DNA and encoded proteins according to the present invention are 
described in Figures 1, 2, 3, 4, 5, 6, 7, 8 and 9. Typically, functional fragments or derivatives 
are characterized by an amino acid sequence comprising a component sequence of at least 
150 amino acid residues having 40% or more identity with an aligned component sequence
of one or more of the polypeptides encoded by the DNA of Figures 1 to 9. Preferably the
amino acid sequence identity is higher than 50% or even higher than 55%. 
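
One way to check the criterion of a component sequence of at least 150 residues with 40% or more identity is a sliding window over an existing alignment. The sketch below is a simplification under stated assumptions: it inspects windows of exactly `window` alignment columns (gap columns count as non-identical), and the function name and defaults are hypothetical.

```python
def has_conserved_window(aligned_a, aligned_b, window=150, min_identity=40.0):
    """True if some stretch of `window` alignment columns reaches min_identity percent."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # 1 where both sequences carry the same residue, 0 otherwise (gaps included).
    matches = [1 if x == y and x != "-" else 0 for x, y in zip(aligned_a, aligned_b)]
    if len(matches) < window:
        return False
    count = sum(matches[:window])
    best = count
    for i in range(window, len(matches)):
        # Slide the window by one column: add the new column, drop the oldest.
        count += matches[i] - matches[i - window]
        best = max(best, count)
    return 100.0 * best / window >= min_identity
```

The thresholds can be raised to 50% or 55% through `min_identity` to reflect the preferred identity levels.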

DNA encoding proteins according to the present invention can be isolated from 
monocotyledonous and dicotyledonous plants. Preferred sources are corn, sugarbeet, 
sunflower, winter oilseed rape, soybean, cotton, wheat, rice, potato, broccoli, cauliflower, 
cabbage, cucumber, sweet corn, daikon, garden beans, lettuce, melon, pepper, squash, 
tomato, or watermelon. However, they can also be isolated from mammalian sources such
as mouse or human tissues. The following general method can be used, which the person
skilled in the art knows to adapt to the specific task. A single stranded fragment of the
desired gene consisting of at least 15, preferably 20 to 30 or even more than 100
consecutive nucleotides is used as a probe to screen a DNA library for clones hybridizing to
said fragment. The factors to be observed for hybridization are described in Sambrook et al.,
Molecular cloning: A laboratory manual, Cold Spring Harbor Laboratory Press, chapters 
9.47-9.57 and 11.45-11.49, 1989. Hybridizing clones are sequenced and DNA of clones 
comprising a complete coding region encoding a protein characterized by an amino acid 
sequence comprising a component sequence of at least 150 amino acid residues having 
40% or more sequence identity to the protein sequence encoded by the desired gene is 
purified. Said DNA can then be further processed by a number of routine recombinant DNA 
techniques such as restriction enzyme digestion, ligation, or polymerase chain reaction 
analysis. The disclosure of the nucleotide sequences in Figs 1-9 enables a person skilled in 
the art to design oligonucleotides for polymerase chain reactions which attempt to amplify 
DNA fragments from templates comprising a sequence of nucleotides characterized by any 
continuous sequence of 15 and preferably 20 to 30 or more basepairs of the desired gene. 
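
As a rough illustration of how oligonucleotides can be derived from a disclosed sequence, the sketch below takes the first and last stretch of a template as a primer pair. It is a minimal sketch under stated assumptions: the function names are hypothetical, the template is assumed to contain only A, C, G and T, and practical primer design would additionally consider melting temperature, GC content and secondary structure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_pair(template, length=20):
    """Return (forward, reverse) primers spanning the ends of the template."""
    if length < 15:
        raise ValueError("oligonucleotides of at least 15 nucleotides are assumed")
    if len(template) < length:
        raise ValueError("template is shorter than the requested primer length")
    template = template.upper()
    forward = template[:length]                        # identical to the top strand
    reverse = reverse_complement(template[-length:])   # anneals to the top strand
    return forward, reverse
```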

Suitable vectors for practicing the methods of the invention are well known in the art. 
Similarly, host cells can be derived from monocotyledonous or dicotyledonous plants.
Preferred sources are corn, sugarbeet, sunflower, winter oilseed rape, soybean, cotton,
wheat, rice, potato, broccoli, cauliflower, cabbage, cucumber, sweet corn, daikon, garden
beans, lettuce, melon, pepper, squash, tomato, or watermelon. However, host cells can also
be isolated from other sources, including mammalian sources such as mouse or human 
cells, in particular stem cells. It is preferred that mammalian homologues are used in 
mammalian cells. 

The methods for increasing homologous recombination are useful to obtain gene targeting 
so that a gene of interest is introduced into the genome at a desired locus, instead of
randomly. For some hosts, in particular crop plants, the gene is preferably expressed in a
selected tissue where expression is needed. This is easily achieved by the use of a
tissue-specific promoter. Thus, the present invention provides a method for increasing somatic
homologous recombination and increasing gene targeting by modulating the expression or
properties of one or more gene products selected from the group consisting of MBM17.5,
MBM17.6, osMBM17.5, zmMBM17.5, AtIno80, At3g57300, Rvb1 (At5g22330), Rvb21
(At5g67630), Rvb22 (At3g49830) and At3g57290, and fragments, derivatives and
homologues thereof, essentially as described above.

The methods are also useful to improve meiotic recombination, thereby facilitating breeding 
of species, in which genes encoding a particular phenotype are transferred between plants. 
Crossing in an interesting trait from another variety or species into a given variety by 
conventional breeding is a very time- and labour-intensive process. Several generations of
back-crosses have to be carried out to eliminate the undesired genetic material of the donor
species, while maintaining the desired phenotype or trait. Using the methods described
above for increasing homologous recombination, meiotic recombination frequencies can be
increased, preferably by expressing the desired gene under the control of a meiosis-specific
promoter or inducible promoter, so that the breeding process is sped up. Thus, the present
invention provides a method for increasing meiotic recombination by modulating the
expression or properties of one or more gene products selected from the group consisting of
MBM17.5, MBM17.6, osMBM17.5, zmMBM17.5, AtIno80, At3g57300, Rvb1 (At5g22330),
Rvb21 (At5g67630), Rvb22 (At3g49830) and At3g57290, and fragments, derivatives and
homologues thereof, essentially as described above. 

The Examples below are provided for illustrative purposes and are in no way intended to be
limiting to the invention.

EXAMPLES

Example 1: Identification of At5g63950 (MBM17.5) as gene effective in increasing 
homologous gene recombination in the mutant hw17. 

For our screening we used a newly constructed transgenic Arabidopsis thaliana line
that carries a recombination reporter construct based on the firefly luciferase gene. The
structure of the reporter construct - two segments of the luciferase gene arranged as
inverted repeats - is comparable to that of the previously described beta-glucuronidase
reporter (Swoboda et al. 1994, Puchta et al. 1995a, Puchta et al. 1995b), but offers the
advantage that recombination events can be detected in living plants. Luciferase activity in
cells in which recombination has restored an intact luciferase gene can be detected by light
emission after application of the substrate D-luciferin using a high-sensitivity CCD camera
(Millar et al. 1992, Millar et al. 1995a, Millar et al. 1995b, Michelet & Chua 1996).

To induce hyperrecombination mutations in the luciferase recombination reporter line, we
used T-DNA activation tagging with a mutagenic construct (pAC102). "Activation tagging" 
refers to the transcriptional activation of endogenous plant genes by random integration of a 
construct that carries promoter or enhancer sequences. One published approach for
"activation tagging" is the introduction, via Agrobacterium-mediated gene transfer, of a
T-DNA carrying several copies of the cauliflower mosaic virus (CaMV) 35S enhancer (Fang et
al. 1989), which can activate the expression of heterologous genes over a distance (Hayashi
et al. 1992, Walden et al. 1994, Kakimoto 1996, Kardailsky et al. 1999, Weigel et al. 2000).
Another published approach is the introduction of a complete, outward-pointing CaMV 35S
promoter on a transposable Ds element (Wilson et al. 1996, Schaffer et al. 1998, Fridborg et
al. 1999). The construct "pAC102" used for our experiments is a combination of these 
previously described elements: it is a binary vector carrying a T-DNA that can be transferred
to plants and that contains a complete, outward-pointing copy of the CaMV 35S
promoter/enhancer close to the right T-DNA border. Thus, this construct combines the ease 
of application of T-DNA gene transfer with the genetic ability of a complete promoter, 
avoiding some of the drawbacks of enhancer-only constructs (Weigel et al. 2000). 

In principle, the activation tagging construct can cause several kinds of mutations after 
integration in the plant genome: gene disruption by insertion within a coding sequence, 
activation of plant gene expression by action of the CaMV 35S enhancer, direct expression 
of a plant gene from the CaMV 35S promoter on the T-DNA, or down-regulation of 
expression by antisense RNA production driven from the CaMV 35S promoter. The pAC102
T-DNA carries, in addition to the 35S promoter, a complete copy of the pUC cloning vector to
facilitate gene cloning by plasmid rescue (Dilkes & Feldmann 1998), and a sulfonamide 
resistance marker (Guerineau et al. 1990, Reiss et al. 1996) for selection of transgenic 
plants. 

We transformed 13,000 three-week-old Arabidopsis ecotype Columbia plants from the
luciferase recombination reporter line with the activation-tagging T-DNA construct "pAC102"
by Agrobacterium-mediated gene transfer, using the established "floral dip" procedure
(Clough & Bent 1998) with a modified infiltration buffer, in which the Silwet L-77 detergent
was replaced by 0.05% Extravon® (Ciba). Seeds from the infiltrated plants were harvested
three weeks after infiltration. Transgenic progeny carrying the pAC102 activation tagging
T-DNA were selected by sowing seeds on perlite substrate drenched with Gamborg B5 mineral
medium (Gamborg et al. 1968) containing 10 mg/l sulfadiazine (Sigma), and transferring
surviving individuals after 10 days to soil. About 20,000 sulfonamide-resistant plants were
isolated; they represent independent transformants with the pAC102 T-DNA activation
tagging construct integrated at different random positions in the Arabidopsis genome.

When individual plants had grown to the 10-leaf stage, they were assayed for luciferase 
activity to detect somatic recombination events. Batches of 25 plants were sprayed with the 
substrate D-luciferin and pictures (typically two) were taken with an "Astrocam" (Gloor
Instruments, Uster) by integrating photons over 15 min. Background noise and cosmic
radiation were filtered out by correlating both images using the minimum function. Plants
showing an increased number of sectors with luciferase activity relative to the average of the
population were observed with a frequency of about 1 in 500 plants.
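
The filtering by the minimum function can be sketched as a pixel-wise minimum of two exposures. This is a minimal sketch assuming that both images are equally sized grids of photon counts; the function name and the plain nested-list representation are assumptions, not the software actually used with the camera. A cosmic-ray hit appears in only one exposure and is therefore suppressed, whereas genuine luciferase signal is present in both.

```python
def min_filter(image_a, image_b):
    """Combine two exposures of the same field by taking the pixel-wise minimum."""
    same_rows = len(image_a) == len(image_b)
    if not same_rows or any(len(r) != len(s) for r, s in zip(image_a, image_b)):
        raise ValueError("exposures must have identical dimensions")
    return [
        [min(p, q) for p, q in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]
```

In the example below, the isolated bright pixels (200 and 250) occur in one exposure only and are removed, while the pixel that is bright in both exposures is retained at its lower value.

```python
first = [[5, 200, 5], [5, 80, 5]]
second = [[5, 5, 5], [250, 75, 5]]
combined = min_filter(first, second)  # [[5, 5, 5], [5, 75, 5]]
```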

As a result of the screen, one plant line, termed "hw17", showed a more than 10-fold
increase in number of luciferase sectors. The original transformed plant "hw17" was grown to
maturity to obtain seeds. Progeny plants also showed an increase in number of
luciferase-expressing sectors, suggesting that this plant line carries a heritable mutation resulting in
increased somatic recombination frequencies. To characterize the T-DNA integration pattern
in this plant line by Southern blotting, callus was induced from leaves of the original
transformant, and genomic DNA was prepared once sufficient plant material was produced.
DNA was digested with HindIII, transferred to nylon membranes after electrophoretic
separation, and probed using a DIG-labeled pUC-bla (pUC beta-lactamase) PCR product to
detect genomic fragments carrying the right end of the pAC102 activation tagging T-DNA.
HindIII cuts twice within the pAC102 T-DNA, and the pUC-bla segment detected by the
probe lies between the T-DNA right border and one of these recognition sites. Therefore,
each independent integration site is detected as a HindIII fragment on Southern blots
consisting of the right end of the pAC102 T-DNA, including the pUC vector sequences and
the CaMV 35S promoter, and a variable length of plant DNA extending from the right end
integration site of the pAC102 T-DNA to the nearest HindIII restriction site in the plant
genome.
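
The reasoning behind the expected fragment can be sketched in silico: the fragment detected by a probe is bounded by the HindIII sites flanking the probed position. The sketch assumes a linear sequence given as a string and uses the HindIII recognition sequence AAGCTT, cut after the first A; the function names and the toy sequence in the example are hypothetical and not derived from pAC102.

```python
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence, cut as A^AGCTT


def hindiii_cut_positions(seq):
    """Return the cut positions (0-based index of the base after each cut)."""
    seq = seq.upper()
    cuts = []
    start = seq.find(HINDIII_SITE)
    while start != -1:
        cuts.append(start + 1)
        start = seq.find(HINDIII_SITE, start + 1)
    return cuts


def fragment_containing(seq, position):
    """Return (start, end) of the HindIII fragment of a linear seq covering position."""
    bounds = [0] + hindiii_cut_positions(seq) + [len(seq)]
    for left, right in zip(bounds, bounds[1:]):
        if left <= position < right:
            return left, right
    raise ValueError("position lies outside the sequence")
```

For a toy sequence with two sites, a probe position between them yields the internal fragment:

```python
toy = "GG" + "AAGCTT" + "CCCCC" + "AAGCTT" + "TT"
start, end = fragment_containing(toy, 10)  # (3, 14), a fragment of 11 bases
```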

Two bands of about 5 kb and 10 kb were detected, suggesting two independent T-DNA
insertion events. To isolate the plant genomic sequences adjacent to the pAC102 T-DNA
integration, we used the technique of plasmid rescue cloning (Dilkes & Feldmann 1998,
Mathur et al. 1998). Briefly, we digested plant genomic DNA with HindIII, circularized the
resulting fragments by ligation at low DNA concentration, and transformed the ligation
mixture into competent E. coli TOP10 cells (commercially available from Invitrogen) by
electroporation. Since the HindIII fragments containing the fusion joint between plant DNA
and the right end of the activation tagging construct carry a plasmid origin and the ampicillin
resistance gene (bla) contributed by pAC102, circularization of such fragments will result in a
functional bacterial plasmid and confer ampicillin resistance to the E. coli cells.
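
The selection logic behind plasmid rescue can be illustrated with a short Python sketch. This is a toy model under stated assumptions, not the procedure used in this example: the sequences, the marker strings standing in for the plasmid origin and the bla gene, and the function names are all hypothetical placeholders. Only the HindIII recognition sequence (AAGCTT, cut after the first A) is taken as known.

```python
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence, cut as A^AGCTT
CUT_OFFSET = 1           # the cut falls after the first base of the site

# Hypothetical placeholder motifs standing in for real vector features.
BLA_TAG = "GGGTTTGGGTTT"  # stands in for the ampicillin resistance gene (bla)
ORI_TAG = "CCCGGGCCCGGG"  # stands in for the plasmid origin of replication


def hindiii_fragments(seq):
    """Return the fragments of a complete HindIII digest of a linear sequence."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(HINDIII_SITE)
    while pos != -1:
        cuts.append(pos + CUT_OFFSET)
        pos = seq.find(HINDIII_SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def is_rescuable(fragment):
    """A circularized fragment gives a colony only if it carries origin and bla."""
    return ORI_TAG in fragment and BLA_TAG in fragment


# Toy locus: plant DNA, a plant HindIII site, the plant flank, the T-DNA right
# end (bla + origin), an internal T-DNA HindIII site, more T-DNA, more plant DNA.
plant_a = "ACGTACGTACGT"
plant_flank = "TTGACCTTGACC"
tdna_right_end = BLA_TAG + "CATCATCAT" + ORI_TAG
tdna_internal = "TGTGTGTG"
genome = (plant_a + HINDIII_SITE + plant_flank + tdna_right_end
          + HINDIII_SITE + tdna_internal + HINDIII_SITE + plant_a)

fragments = hindiii_fragments(genome)
rescued = [f for f in fragments if is_rescuable(f)]
print(len(fragments), "fragments,", len(rescued), "rescuable")
print("plant flank recovered:", plant_flank in rescued[0])
```

In this toy locus only the fragment spanning the plant flank and the T-DNA right end passes the filter, which mirrors why ampicillin selection enriches for the junction fragments.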

Several colonies were obtained after plating the transformed bacteria on selection medium
containing ampicillin. Plasmid DNA of these transformants was prepared and characterized
by restriction analysis. The plasmids fell into two classes after re-digestion with HindIII: one
class contained a HindIII fragment of approximately 5 kb, the other one of approximately 10
kb, corresponding to the sizes of the T-DNA integration fragments in the genome of the hw17
plant detected by the Southern blotting described above.

To determine the nature of the plant sequences joined to the right end of the T-DNA, the
plant DNA insert from these rescued plasmids was sequenced from both sides, using one
custom sequencing primer complementary to the T-DNA right end reading towards the plant
DNA, and the standard M13 reverse sequencing primer, reading from the pAC102 vector
sequences into the plant DNA insert from the other end. The obtained DNA sequences were
compared to the GenBank nucleotide database using the BLASTN search program.

The insert of one plasmid, pJL604.2, corresponding in size to the 10 kb band detected on
the Southern blots, was highly similar to several Arabidopsis genomic ribosomal DNA gene
repeat sequences. This suggests that one of the two pAC102 copies detected in the genome
of the hyperrecombination mutant plant "hw17" is located within rDNA repeats. There are
about 570 highly expressed copies of these sequences distributed throughout the
Arabidopsis genome (Pruitt & Meyerowitz 1986). We therefore consider it very unlikely that
a change in expression or a mutation of one of them, caused by an insertion of the activation
tagging construct, would cause a hyperrecombination phenotype.

The insert of a second plasmid, pJL604.1, was identical to part of a 52717 bp P1 clone
(MBM17) derived from chromosome 5 of Arabidopsis thaliana (GenBank Accession number
AB019227; submitted on 29-OCT-1998 to the DDBJ/EMBL/GenBank databases by
Yasukazu Nakamura, Kazusa DNA Research Institute). The sequence contained in the
circularized rescued plasmid pJL604.1 extends from nucleotide 20310 of MBM17, which is
joined to the pAC102 right end with the 35S promoter, to a HindIII site at position 18503 in
MBM17, which is joined to an internal HindIII site within pAC102. To confirm that the T-DNA -
plant DNA junction found on plasmid pJL604.1 really is derived from the genome of the
hyperrecombination mutant "hw17", we performed a PCR reaction with one primer annealing
within the Arabidopsis genomic insert and one annealing close to the right border of the
pAC102 T-DNA. Using pJL604.1 plasmid DNA or "hw17" plant genomic DNA, we observed a
PCR product of identical size, confirming that pJL604.1 carries the authentic pAC102 - plant
DNA fusion joint.

In the mutant plant "hw17", the right end of the pAC102 activation tagging T-DNA is fused to
nucleotide 20310 of the plant genomic sequence, in such a way that the 35S promoter is
pointing towards the beginning of the genomic clone MBM17. Further characterization of the
genetic locus revealed complex rearrangements of DNA upon integration of the T-DNA. In
particular, genomic Arabidopsis DNA was found inserted into the coding region of the
predicted gene mbm17.6.

It has been reported that T-DNA insertions are often accompanied by small deletions or
rearrangements of DNA sequences in the vicinity of the T-DNA (Mayerhofer et al. 1991;
Nacry et al. 1998). Also, the enhancer present in the CaMV 35S promoter could affect the
expression of genes over a distance and might act on several genes surrounding the
integration site, although so far enhancer action observed in Arabidopsis plants in vivo in
activation tagging experiments did not affect sequences further than 3.6 kb away (Weigel et
al. 2000).

GenBank database annotation of mbm17.5 predicted a gene with similarities to Rad26
nucleotide excision repair proteins (Figure 1). Comparison of the predicted protein-coding
segments against the GenPept/SwissProt protein database using the BLASTP program
revealed many similar protein sequences of known function of the SWI2/SNF2
helicase/ATPase protein family. A similarity search of the protein database revealed that the
central region of this predicted protein of 1043 amino acids has significant similarity to a
number of proteins involved in DNA binding, repair, recombination, and chromatin
remodeling. In particular, the human protein ERCC6 (Troelstra et al. 1992), involved in
Cockayne's syndrome, and its S. cerevisiae homologue RAD26 (van Gool et al. 1994) are
important for the repair of active genes; the proteins RAD54 and rhp54, from S. cerevisiae
and S. pombe respectively (Emery et al. 1991, Muris et al. 1996), and their mammalian
homologues (Essers et al. 2000) are involved in DNA recombination and repair; and the yeast
proteins MOT1 (Davis et al. 1992) and SNF2 (Laurent et al. 1991, Richmond & Peterson 1996)
are known to affect the expression of numerous genes, most likely by ATP-dependent chromatin
remodeling. All these proteins share an extended protein sequence motif with the predicted
product of the MBM17.5 coding sequence, the so-called helicase/ATPase domain of the
SWI2/SNF2 protein family (Gorbalenya & Koonin 1993, Aravind et al. 1999, Muchardt &
Yaniv 1999, Travers 1999).

We consider it most likely that the hyperrecombination phenotype detected in mutant line
"hw17" is caused by insertion of the activation T-DNA into the predicted coding sequence
MBM17.5. Since the recombination phenotype was observed in primary transformants, it is
most likely dominant. An insertion of the pAC102 T-DNA at the observed position might
cause a phenotype by disrupting the coding sequence of MBM17.5 and/or by the
overexpression of a C-terminal fragment of this coding sequence that might have some
activity by itself, or might interfere with the function of the intact MBM17.5 gene product.
Using Northern blot analysis and RT-PCR we have shown that the activation tag
of pAC102 is active in the mutant hw17, giving rise to a very abundant transcript (Figure
1) with a 705 bp open reading frame, homologous to the last 235 amino acids of the
MBM17.5 protein. Although not wishing to be bound by theory, this truncated polypeptide
may cause the hyperrecombination phenotype of the mutant hw17 by sequestering
the functional, complete gene product.
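
The relation between reading frame length and encoded residues can be checked with a small translation sketch. The Python below is illustrative only: it uses the standard genetic code, the helper name `translate` is our own, and the two test fragments are the first 30 and last 15 nucleotides of the full-length cDNA as printed in Figure 1C.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


cdna_start = "ATGGCGGAAA ATACGGCCAG CCATAGAAGA"  # positions 1-30, Figure 1C
cdna_end = "GGATTGAATCTGTAG"                     # last 15 nt, Figure 1C

print(translate(cdna_start))  # MAENTASHRR
print(translate(cdna_end))    # GLNL
print(705 // 3)               # 235 codons in a 705 bp reading frame
```

Both translated fragments agree with the ends of the protein sequence shown in Figure 1D.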

Because of its strong similarity with other proteins known to be involved in DNA repair,
chromatin structure and recombination, we consider that the MBM17.5 predicted coding
sequence is the target for the mutation in the hyperrecombination mutant plant "hw17". The
DNA sequence is a useful tool to manipulate somatic recombination in Arabidopsis. For
example, over-expression of the truncated C-terminus of MBM17.5 acts dominantly, thereby
allowing recombination frequency to be manipulated in selected cells by the use of
tissue-specific promoters and/or transiently by the use of inducible promoters.

The sequence of the cloned, full length cDNA of the mbm17.5 gene (Figure 1) encodes a
protein of 1090 amino acids. There are two differences between the cloned and predicted
protein sequence, due to the use of different splice sites in vivo than in the predicted
transcript. Using sequence alignment algorithms we found that the highest similarity to
known proteins is restricted to the central part of mbm17.5 (aa 370 - aa 900), containing the
seven conserved helicase/ATPase motifs of the SWI2/SNF2 helicase family. The amino- and
carboxy-termini of the predicted protein MBM17.5 seem to be less strongly conserved.
Orthologs of MBM17.5 in other plant species have been identified (see Examples 3 and 4).

The link between the hyperrecombination phenotype and the T-DNA insertion in MBM17.5
has been confirmed by segregation analysis of progeny up to the T6 generation. Analyses of
plants over-expressing the cDNA, parts of it or anti-sense sequences can be used to
demonstrate that the alteration of somatic homologous recombination frequency is due to
mbm17.



Example 2: Identification of At5g63960 (mbm17.6) as gene causing
hyperrecombination in mutant hw17

Due to the complex rearrangements upon integration of the mutagenizing T-DNA, other
genes in the region of BAC clone mbm17 could be affected. While cloning the junction between
the T-DNA left border and genomic DNA, we found an insertion of Arabidopsis genomic DNA,
located on TAC clone K19M22, into the coding region of the neighboring gene mbm17.6. The DNA
is integrated 6 nucleotides downstream of the putative start codon, probably abolishing the
expression of a functional gene product of mbm17.6.



GenBank annotation of mbm17.6 predicted a gene (Figure 2A) homologous to the DNA
polymerase III catalytic subunit of S. cerevisiae (Sitney et al. 1989). DNA polymerase III was
shown to be involved in accurate DNA replication (Simon et al. 1991; for review: Sugino
1995) and in post-replicational repair of damaged DNA (Torres-Ramos et al. 1997). DNA
replication and repair pathways are dependent on DNA polymerases, so the
hyperrecombination phenotype of hw17 could be caused by the presence of less DNA
polymerase III protein, or of non-functional protein.

Example 3: The rice homolog of mbm17.5 can be used to increase homologous
gene recombination

Targeted genetic modification of the model plant Arabidopsis might become an important
tool for academic research, but the need for targeted gene placement is much higher in crop
plants. Using the t-Blast algorithm to search a plant EST database, we found rice EST clones
(RICS1367A; MAFF DNA bank, Japan). We sequenced the rice EST clone RICS1367A and
found an open reading frame having high homology to the MBM17.5 protein sequence,
covering not only the conserved helicase/ATPase motifs (Figs. 3, 10) but extending to the
C-terminus of mbm17.5. The rice homolog of mbm17.5 can be used for increasing the
efficiency of targeted modification of rice plants, following the strategies described earlier.

Example 4: The maize homolog of mbm17.5 can be used to increase homologous
gene recombination

Using the t-Blast algorithm to search a plant EST database, we found a maize EST clone
(60301 1H11; Stanford University, USA) that is an ortholog of the Arabidopsis gene mbm17.5
(Figs. 4, 10). The maize homolog of mbm17.5 can be used to increase the efficiency of
targeted modification of maize, following the strategies described above.

Example 5: sm22 mutant. Determination of sm22 transcript (helicase/ATPase) as an
agent that improves homologous recombination

From the same screen as described in Example 1, a second hyperrecombination mutant
plant, called sm22, was isolated. The original hyperrecombination phenotype of the sm22 plant
shows an enhancement of about 20- to 50-fold for homologous recombination in the
reporter line. No other obvious phenotype was seen and the seed yield was normal.
Sulfonamide selection in the second generation (T2) revealed a 2/1 or 3/1 segregation of
resistant seedlings, thus showing that there is only 1 locus (or 2 closely linked loci) with an
active T-DNA inserted. However, the T2 recombination phenotype was the same as or lower
than in the wild type (the same or a smaller number of recombination events per plant).
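
How an observed segregation can be compared against a 3/1 or a 2/1 expectation is sketched below as a chi-square goodness-of-fit test in Python. The seedling counts are hypothetical and are not data from this example; the function name is our own, and 3.841 is the standard 5% critical value for one degree of freedom.

```python
CRITICAL_DF1_P05 = 3.841  # chi-square critical value, 1 degree of freedom, p = 0.05


def chi_square(resistant, sensitive, ratio):
    """Goodness-of-fit statistic for an expected resistant:sensitive ratio."""
    total = resistant + sensitive
    r, s = ratio
    expected_res = total * r / (r + s)
    expected_sen = total * s / (r + s)
    return ((resistant - expected_res) ** 2 / expected_res
            + (sensitive - expected_sen) ** 2 / expected_sen)


# Hypothetical counts of sulfonamide-resistant and sensitive T2 seedlings.
resistant, sensitive = 150, 50

for ratio in [(3, 1), (2, 1)]:
    stat = chi_square(resistant, sensitive, ratio)
    verdict = "rejected" if stat > CRITICAL_DF1_P05 else "not rejected"
    print(f"{ratio[0]}:{ratio[1]} chi2 = {stat:.2f} ({verdict} at p = 0.05)")
```

With small seedling numbers the two ratios are often indistinguishable, which is consistent with reporting the segregation as "2/1 or 3/1".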

After HindIII digestion of T1 callus genomic DNA prepared essentially according to the
method of the Nucleon Phytopure protocol and Plant DNA extraction kit (Amersham),
plasmid rescue was applied as described in Example 1, which gave rise to two independent
junction fragments. The first one corresponds to a single T-DNA insertion without deletion
(left border, LB, junction sequenced) in the N-terminal region of a putative ATPase/helicase
gene At3g57300, in antisense orientation. The second T-DNA inserted in a gene with no
obvious relationship to homologous recombination (gb AF082176_1) and does not confer
sulfonamide resistance. Six resistant (T3) families were analysed by PCR and Southern
blotting. Only one family contained some plants with the second insertion, whereas all
families have the helicase insertion site.

In subsequent generations, homozygous plants for the helicase insertion site were obtained. 
The homologous recombination frequency of heterozygous and homozygous plants for this 
insertion site was 80% and 20%, respectively, of the wild type level. 

The predicted helicase gene (8 kb genomic DNA) has about 20 exons encoding a protein of
about 1489 amino acids. It is predicted to be an ATPase of the SWI2/SNF2 family, and
contains several nuclear localization signals (NLS). The complete cDNA (4.8 kb) was cloned
in two steps. First, a public EST containing the 3' part was sequenced. Then the 5' part of
the cDNA was amplified by RT-PCR on Col-0 (Arabidopsis Columbia ecotype, wild type)
callus RNA (prepared with the Qiagen RNeasy Plant Kit), using a primer in the 5'
untranslated region that includes a stop codon in frame with the predicted ATG (smSUT) to
make sure that the complete 5' part of the cDNA was amplified. The primer sequences were
smSUT: ctagaagcttttaaggatTAAgactctcc and, for the 3' primer: ctcgtatgtatcccccttctcc.

The ATPase/helicase encoded by the gene (AGI: At3g57300) is the putative Arabidopsis
ortholog of the yeast Ino80p/YGL150c protein (Ebbert et al. 1999, Shen et al. 2000).
Homologs exist in yeast, budding yeast, Drosophila and human. These four homologues
have several highly conserved regions including the six motifs of the SWI2/SNF2 helicase
domain. Several NLS suggest a nuclear localization of the gene product.

The yeast homolog INO80 (= YGL150c; Ebbert et al. 1999) is part of a large complex of
>1 MDa (the monomeric form is 171 kDa) that contains two essential helicases, Rvb1p and
Rvb2p. This implicates these genes in homologous recombination in eukaryotes (Cho et al.
2001; Jonsson et al. 2001; Wood et al. 2000). Human Rvb1p and Rvb2p are also known
(Kanemaki 1999, Ikura et al. 2000, Shen et al. 2000). In Arabidopsis thaliana we found three
genes closely related to Rvbs from other organisms. The first one is the ortholog of yRvb1
and we named it AtRvb1. We found two counterparts for yRvb2 that we named AtRvb21 and
AtRvb22. The three genes are expressed (RT-PCR) and some of them are positively
regulated by genotoxic stress (UV-C, bleomycin). For treatment with bleomycin (BLM), 2
week-old Arabidopsis seedlings were placed under sterile conditions in liquid GM medium
containing 10^-6 M of BLM (Sigma) or 100 ppm of MMS (Fluka, Switzerland). For UV-C
irradiation (6000 ergs), 2 week-old seedlings were irradiated with light provided by a HNS
55W OFR lamp (Osram). After treatment, plants were harvested at several time points
(30 min, 1 h, 4 h and 12 h) and RNA was extracted as described above. Then semi-quantitative
RT-PCR analysis was performed with the following primer pairs: AtIno80
(TGATGGATCTATCACCATCAG, ggtgggattccaatcactttc), AtRvb1 (tttgatgggccaaatgatg,
cttccaaCCTAGGtgagatgtttcaacaaaatgtgc), AtRvb21 (tcaacagcaggacacaagg,
cccaatgCCTAGGaaatccgagttcaacatcctaatc), AtRvb22 (acaaaccagatatcagcacatgg,
aacaagtactcgctctcatgctc). In the sm22 background the steady state level of AtRvb21 and
AtRvb22 was shown to be down-regulated using RT-PCR on RNA extracted as mentioned
above.

This indicates that the components of the putative Arabidopsis Ino80 complex show
co-regulation at the transcriptional level, supporting the use of Arabidopsis Rvb1, Rvb21 and
Rvb22 to manipulate homologous recombination frequency in plants.

Example 6: AtRvb1 as positive regulator of homologous recombination.

As described above (Example 5), the original recombination-up phenotype found in sm22 can
be associated with an effect mediated by the Arabidopsis Rvb1 and 2 orthologs. Thus,
AtRvb1 can be used as a positive regulator of homologous recombination.

Example 7: AtRvb21 as positive regulator of homologous recombination.

As described above (Example 5), the original recombination-up phenotype found in sm22 can
be associated with an effect mediated by the Arabidopsis Rvb1 and 2 orthologs. Thus,
AtRvb21 can be used as a positive regulator of homologous recombination.

Example 8: AtRvb22 as positive regulator of homologous recombination.

As described above (Example 5), the original recombination-up phenotype found in sm22 can
be associated with an effect mediated by the Arabidopsis Rvb1 and 2 orthologs. Thus,
AtRvb22 can be used as a positive regulator of homologous recombination.

Example 9: At3g57290 as positive regulator of homologous recombination.

In the sm22 mutant (Example 5), the At3g57290 gene is potentially overexpressed by the
35S enhancer/promoter. Overexpression of this gene in the sm22 context or directly with a
35S promoter can be carried out to reproduce the original recombination-up phenotype. The
phenotype was lost in the second generation (Example 5), at which point At3g57290 is no
longer overexpressed, allowing temporal modulation of homologous
recombination.



All publications referred to herein are incorporated by reference as if each were
individually incorporated.

What is claimed is:

1. An isolated nucleic acid comprising a sequence having 98.5% or more identity with
the sequence depicted in Figure 1C or Figure 1E.

2. The nucleic acid of claim 1, wherein said nucleic acid is DNA.

3. A vector comprising the nucleic acid of claim 2.

4. A host cell comprising the vector or nucleic acid of claim 3.

5. A polypeptide encoded by the isolated nucleic acid of claim 1.

6. An isolated nucleic acid comprising a sequence having 98.5% or more identity with 
the sequence depicted in Figure 5A. 

7. The nucleic acid of claim 6, wherein said nucleic acid is DNA.

8. A vector comprising the nucleic acid of claim 7. 

9. A host cell comprising the vector or nucleic acid of claim 8.

10. A polypeptide encoded by the isolated nucleic acid of claim 6. 

11. A method for inducing homologous recombination in a cell, said method comprising 
modulating the expression or properties of one or more gene products selected 
from the group consisting of MBM17.5, MBM17.6, osMBM17.5, zmMBM17.5, 
AtIno80, At3g57300, Rvb1 (At5g22330), Rvb21 (At5g67630), Rvb22 (At3g49830)
and At3g57290. 

12. The method of claim 11, said method comprising increasing expression of said 
gene product. 

13. The method of claim 12, said method comprising introducing a nucleic acid
encoding said gene product into said cell operably linked to a promoter; and
allowing transcription and translation of said gene in an amount sufficient to affect
homologous recombination in said cell.

14. The method of claim 13, wherein said homologous recombination is somatic
homologous recombination.

15. The method of claim 13, wherein said homologous recombination is meiotic
homologous recombination.

16. The method of claim 13, wherein said promoter is an inducible promoter. 

17. The method of claim 13, wherein said promoter is a tissue-specific promoter. 

18. The method of claim 13, wherein said promoter is a constitutive promoter. 

19. The method of claim 13, wherein said promoter is a meiosis-specific promoter. 

20. A method of increasing gene targeting to a desired locus in a host cell, said
method comprising introducing a desired gene into a host cell, modulating the
expression or properties of one or more gene products selected from the group
consisting of MBM17.5, MBM17.6, osMBM17.5, zmMBM17.5, AtIno80, At3g57300,
Rvb1 (At5g22330), Rvb21 (At5g67630), Rvb22 (At3g49830) and At3g57290, or
functional fragments, derivatives and homologues thereof in said host cell, and
detecting integration of said desired gene at a selected locus in the genome of said
host cell.



Abstract 



The present invention relates to nucleic acids encoding polypeptides involved in homologous
recombination, as well as vectors and host cells comprising the nucleic acids and
polypeptides encoded by the nucleic acids. Also provided are methods for inducing somatic
and/or meiotic homologous recombination in a cell, comprising modulating the expression or
properties of one or more gene products selected from the group consisting of MBM17.5,
MBM17.6, osMBM17.5, zmMBM17.5, AtIno80, At3g57300, Rvb1 (At5g22330), Rvb21
(At5g67630), Rvb22 (At3g49830) and At3g57290, their homologues, fragments or
derivatives. In particular, the methods can be used to increase gene targeting.

Figure 1A

>predicted cDNA of mbm17.5 

atggcggaaaatacggccagccatagaagaaaacctcggagcttgaacgatcgtcactacagtatcctccaggatclttctgcg 
crtcrtagacagcctccctcrttcttctcatggagaagatgaagagacgaagaagtccatgattaagcttgctggacgacgtcgtct 




catcactagacagcgctggaattgggaacaaattcacatcttgggatgaatcaaaggaagctaacactgagctggctggcgag 
cctaacttttcgataatcacagacttttgttcgccctc acctcagttga agcaaaaagaggaaatgcaaggtgatggaggaagga 

acgagatcaiMtatmSatgatttgsiM^^ 



agagcaaactagtaaggaattttcaagggaatgggaagaaagaatttcgaatgttggaaagcaaaactcatattctggtcggc 
actttgacgataactctgaagataataggcagggatacaatcttgaccgtgggaagagccaatgcaaggaagtcgaccaaagt 



gacgaggatgatgatgatgatgactgtctcattttgtccgggaaaaaggcggctgaaatgaaaattaataagccagctcggtctt 
ataacgccaaaagacatggttatgatgagagatcgttggaagatgaagggtctatcactttaaclggcctcaatttgtcttacacat 
tgcctggaaagattgcaacaatgttatatccacatcagagggaagggttgaattggctttggtcattgcatacccaagggaaagg 

tggaata 

gctctggtagtggccccaaaaacci 
actacggtacttctacgaaagc^cgggaatatgatctccaccacattctgcagggtaaaggtattcttctaacaacctatgatattg^ 

gcggaacaatacaaaggctttgcaaggtgacgaccattatactgatgaggatgatgaagatggaaacaaatgggactacatg 

attctggacgagggacatcttattaagaaccccaacacacaaagggcgaagagtttgcttgagatcccaagttctcaccgtattat 

aataagtggtacaccaatccagaacaatctcaaggtattattgtctatgacatttaacgttgctgccctgggttactcggtgacaag 

aattggtaaacatatcctMctataatmgtcagagtacaWaccagctctgtattactaaggdttaagctttaacaggtttaagcag 

aattatgagcattacattcttcgtggaactgacaaaaatgctactgatagagaacagaggataggctcaacagtagcaaagaa 

cttgagggagcatattcaaccmcttcttgcggcgccttaagagtgaagt<mcggtgatgatggtgcaacrtccaaactttcgaag 



aaggacgaaat 

cmtgatggttcacctctagcagctctaacgattctgaagaaaatatgtgaccacccgcttctcttaactaagagggctgctgagga 
tgtccttgaaggaatggattcaacattaacacaagaagaagcaggcgtggctgagagattggctatgcatatagcggacaatgt 



gtggcAcctataWctcttgacttctcaagttggtggtctcggccttactctgactaaggcagaccgtgtgattgtggtggaccc^^ 
ggaatccaagcacggacaaccagagtgttgatcgagcatatagaatlgggcagacaaaggatgtcatcgtatataggttaatg 
acctcagcaactgttgaagaaaagatatacagaaagcaggtatacaagggaggattgtttaaaactgcaactgagcataaag 
aacaaatccgctacttcagccagcaggaccttcgagaacttmagtcttcccaagggaggctttgatgtttcacctacacaacagc 
aactatacgaagagcactataaccaaatcaaactagatgaaaaactggaatccc&ttgtaaagtttctcgaaacccttggtatag

ctggagttagccaccatagcttacttttctccaagacagctcctattcaagcgatacagsiaagatgaagaagaacaaataaggg 

ctgactatgctttcaagccaaaggatgtgaatttggacaagagaatcaacatttcccca!i3tcgatgacaaggaattgtcagaaag 

cgtaattaaagcaagactcaatcgtttgacgatgctattacaaaacaagggtacggtcicaaggctacctgatggaggggcaaa 

aatccagaagcagattgctgaattgactcgagaactgaaagacatgaaagcagcagaiaaggatcaacatgcctcaagttattg 

acttggaggaggatataagtcggaagatgcaaaaaggattgaatctgtag 

Figure 1B

>predicted protein sequence of MBM17.5 

MAENTASHRRKPRSLNDRHYSILQDLSAPPRQPPSSSHGEDEETKKSMIKLAGRRRLCKAL 

PKEDEADGYDDPDLVDFYSPVKGETSLDSAGIGNKFTSWDESKEANTELAGEPNFSIITDFC 

SPSPQLKQKEEMQGDGGRNEIMGILDDLTSKLGTMSIQKKKOSQSNDFDACGVKSQVDKF 

DFEDAKSSFSLLSDLSKSSPDWTTYNAGVNSIKDKQGKSGFAIREEQTSKEFSREWEERIS 

NVGKQNSYSGRHFDDNSEDNRQGYNLDRGKSQCKEVDQSMKTTRHIEVSEKIRTVGRSNA 

AKLRDLDEDDDDDDCLILSGKKAAEMKINKPARSYNAKRHGYDER3LEDEGSITLTGLNLSY 

TLPGKIATMLYPHQREGLNWLWSLHTQGKGGILGDDMGLGKmQICSFLAGLFHSKLIKRA 

LWAPKTLLPHWMKELATVGLSQMTREYYGTSTKAREYDLHHIi.QGKGILLTTYDIVRNNTK 

ALQGDDHYTDEDDEDGNKWDYMILDEGHLIKNPNTQRAKSLI.EIPSSHRIIISGTPIQNNLKV 

LLSMTFNVAALGYSVTRIGKHILPIILSEYIYQLCITKALSFNRFKQNYEHYILRGTDKNATDRE 

QRIGSTVAKNLREHIQPFFLRRLKSEVFGDDGATSKLSKKDEIWWLRLTACQRQLYEAFLN 

SEIVLSAFDGSPLAALTILKKICDHPLLLTKRAAEDVLEGMDSTLTQEEAGVAERLAMHIADNV 

DTDDFQTKNDSISCKLSFIMSLLEFQEGHVAPJFLLTSQVGGLGLTLTKADRVIWDPAWNPS 

TDNQSVDRAYRIGQTKDVIVYRLMTSATVEEKIYRKQVYKoGLFKTATEHKEQIRYFSQQDL 

RELFSLPKGGFDVSPTQQQLYEEHYNQIKLDEKLESHVKFLETLGIAGVSHHSLLFSKTAPIQ 

AIQKDEEEQIRADYAFKPKDVNLDKRINISPVDDKELSESVIKARLNRLTMLLQNKGTVSRLP 

DGGAKIQKQIAELTRELKDMKAAERINMPQVIDLEEDISRKMQKGLNL 

Figure 1C 

>full length cDNA of mbm17.5 

1 ATGGCGGAAA ATACGGCCAG CCATAGAAGA AAACCTCGGA GCTTGAACGA 
51 TCGTCACTAC AGTATCCTCC AGGATCTTTC TGCGCCTCCT AGACAGCCTC 
101 CCTCTTCTTC TCATGGAGAA GATGAAGAGA CGAAGAAGTC CATGATTAAG 
151 CTTGCTGGAC GACGTCGTCT TTGCAAGGCC TTGCCAAAGG AAGACGAAGC 
201 TGATGGATAT GACGATCCTG ATTTGGTTGA TTTCTATTCC CCAGTTAAAG 
251 GAGAGACATC ACTAGACAGC GCTGGAATTG GGAACAAATT CACATCTTGG 
301 GATGAATCAA AGGAAGCTAA CACTGAGCTG GCTGGCGAGC CTAACTTTTC
351 GATAATCACA GACTTTTGTT CGCCCTCACC TCAGTTGAAG CAAAAAGAGG 
401 AAATGCAAGG TGATGGAGGA AGGAACGAGA TCATGGGTAT TTTGGATGAT 
451 TTGACCTCTA AGCTTGGGAC AATGTCGATT CAGAAGAAGA AGGATAGCCA
501 AAGCAATGAT TTTGATGCAT GTGGAGTGAA GAGCCAGGTT GATAAATTTG 
551 ATTTTGAGGA TGCCAAATCC TCATTTTCCT TGCTATCGGA TCTATCTAAG 
601 TCCTCACCAG ATGTGGTTAC CACATATAAT GCTGGCGTTA ATAGTATCAA 
651 GGACAAGCAA GGCAAATCTG GTTTTGCCAT CCGGGAAGAG CAAACTAGTA 
701 AGGAATTTTC AAGGGAATGG GAAGAAAGAA TTTCGAATGT TGGAAAGCAA 
751 AACTCATATT CTGGTCGGCA CTTTGACGAT AACTCTGAAG ATAATAGGCA 
801 GGGATACAAT CTTGACCGTG GGAAGAGCCA ATGCAAGGAA GTCGACCAAA 
851 GTATGAAGAC GACCAGGCAC ATAGAGGTAA GTGAGAAGAT AAGAACAGTC 
901 GGAAGGTCTA ATGCTGCCAA GCTAAGAGAC TTAGACGAGG ATGATGATGA 
951 TGATGACTGT CTCATTTTGT CCGGGAAAAA GGCGGCTGAA ATGAAAATTA 
1001 ATAAGCCAGC TCGGTCTTAT AACGCCAAAA GACATGGTTA TGATGAGAGA 
1051 TCGTTGGAAG ATGAAGGGTC TATCACTTTA ACTGGCCTCA ATTTGTCTTA 
1101 CACATTGCCT GGAAAGATTG CAACAATGTT ATATCCACAT CAGAGGGAAG
1151 GGTTGAATTG GCTTTGGTCA TTGCATACCC AAGGGAAAGG TGGAATACTT
1201 GGAGATGATA TGGGTTTAGG TAAAACTATG CAGATTTGTA GTTTTCTTGC 
1251 TGGTTTATTC CACTCCAAAT TGATCAAGCG TGCTCTGGTA GTGGCCGCAA 
1301 AAACCTTGCT GCCTGACTGG ATGAAAGAAT TAGCTACCGT GGGACTTTCA 
1351 CAAATGACTA GGGAATACTA CGGTACTTCT ACGAAAGCCC GGGAATATGA 
1401 TCTCCACCAC ATTCTGOAGG GTAAAGGTAT TCTTCTAACA ACCTATGATA 
1451 TTGTGCGGAA CAATACAAAG GCTTTGCAAG GTGACGACCA TTATAGTGAT 
1501 gaggatgaYg AAGATGGAAA CAAATGGGAC TACATGATTC TGGACGAGGG 
1551 ACATCTTATT AAGAACCCCA ACAGACAAAG GGCGAAGAGT TTGCTTGAGA 
1601 TCCCAAGTTC TOACCGTATT ATAATAAGTG GTACACCAAT CCAGAACAAT 
1651 CTCAAGGAAC TGTGGGCTGT CTTCAACTTC AGCTGCCCTG GGTTACTCGG 
1701 TGACAAGAAT TGGTTTAAGC AGAATTATGA GCATTACATT CTTCGTGGAA 
1751 CTGACAAAAA TGCTACTGAT AGAGAACAGA GGATAGGCTC AACAGTAGCA 
1801 AAGAACTTGA GGGAGCATAT TCAACCTTTC TTCTTGCGGC GCCTTAAGAG 
1851 TGAAGTCTTC GGTGATGATG GTGCAACCTC CAAACTTTCG AAGAAGGACG 
1901 AAATTGTTGT ATGGTTACGG TTAACAGOTT GGCAGAGGCA ATTATATGAA 
1951 GCTTTCTTAA ACAGTGAAAT TGTTCTGTCA GGTTTTGATG GTTCACCTCT 
2001 AGGAGCTCTA ACGATTCTGA AGAAAATATG TGACGACCCG CTTCTGTTAA 
2051 CTAAGAGGGC TGCTGAGGAT GTCCTTGAAG GAATGGATTC AACATTAACA 
2101 CAAGAAGAAG CAGGCGTGGC TGAGAGATTG GCTATGCATA TAGGGGACAA 
2151 TGTGGATAGA GATGATTTTG AGACCAAGAA TGAGAGTATG TCTTGGAAAT 
2201 TGTCATTTAT GATGTCGGTA CTGGAAAATT TAATTGCAGA GGGGCACGGT 
2251 GTTCTAATCT TCTGCCAGAG ACGCAAGATG GTTAATGTGA TTCAGGATTG 
2301 TCTTACCTCC AACGGTTATA GTTTCTTGCG AATTGATGGT AGAACAAAAG 
2351 GGCCTGACAG ATTGAAGACT GTTGAAGAAT TTGAAGAAGG TGATGTGGCT 
2401 CCTATATTTC TCTTGACTTC TCAAGTTGGT GGTCTCGGGG TTACTGTGAG 
2451 TAAGGCAGAC CGTGTGATTG TGGTGGACCC TGGGTGGAAT CCAAGCAGGG
2501 ACAACCAGAG TGTTGATGGA GGATATAGAA TTGGGGAGAC AAAGGATGTC
2551 ATCGTATATA GGTTAATGAC CTCAGCAAGT GTTGAAGAAA AGATATAGAG
2601 AAAGCAGGTA TACAAGGGAG GATTGTTTAA AACTGCAACT GAGCATAAAG
2651 AAGAAATCCG CTACTTCAGC CAGCAGGACC TTGGAGAAGT TTTTAGTGTT
2701 CCGAAGGGAG GCTTTGATGT TTCACCTACA CAACAGCAAC TATACGAAGA
2751 GCACTATAAG CAAATGAAAC TAGATGAAAA ACTGGAATGG GATGTAAAGT
2801 TTCTCGAAAG CGTTGGTATA GGTGGAGTTA GCCAGGATAG GTTAGTTTTC 
2851 TCCAAGACAG CTGCTATTCA AGCGATACAG AAAGATGAAG AAGAACAAAT 
2901 AAGGAGAGAA AGAGGATTGC TGTTGGGACG CGGATGAGCA AGTATTTGAG 
2951 AAGACAGCGT CATCAATGGG GCTGACTATG CTTTGAAGGC AAAGGATGTG 



3001 AATTTGGACA AGAGAATCAA CATTTCCCCA GTCGATGACA AGGAATTGTC 
3051 AGAAAGCGTA ATTAAAGCAA GACTCAATCG TTTGACGATG CTATTACAAA
3101 ACAAGGGTAC GGTCTCAAGG CTACCTGATG GAGGGGCAAA AATCCAGAAG 
3151 CAGATTGCTG AATTGACTCG AGAACTGAAA GACATGAAAG CAGCAGAAAG 
3201 GATCAACATG CCTCAAGTTA TTGACTTGGA GGAGGATATA AGTCGGAAGA 
3251 TGCAAAAAGG ATTGAATCTG TAG 



Figure 1D

>protein sequence of MBM17.5 

MAENTASHRRKPRSLNDRHYSILQDLSAPPRQPPSSSHGEDEETKKSMIK 50

LAGRRRLCKALPKEDEADGYDDPDLVDFYSPVKGETSLDSAGIGNKFTSW 100

DESKEANTELAGEPNFSIITDFCSPSPQLKQKEEMQGDGGRNEIMGILDD 150

LTSKLGTMSIQKKKDSQSNDFDACGVKSQVDKFDFEDAKSSFSLLSDLSK 200

SSPDVVTTYNAGVNSIKDKQGKSGFAIREEQTSKEFSREWEERISNVGKQ 250

NSYSGRHFDDNSEDNRQGYNLDRGKSQCKEVDQSMKTTRHIEVSEKIRTV 300

GRSNAAKLRDLDEDDDDDDCLILSGKKAAEMKINKPARSYNAKRHGYDER 350

SLEDEGSITLTGLNLSYTLPGKIATMLYPHQREGLNWLWSLHTQGKGGIL 400

GDDMGLGKTMQICSFLAGLFHSKLIKRALVVAPKTLLPHWMKELATVGLS 450

QMTREYYGTSTKAREYDLHHILQGKGILLTTYDIVRNNTKALQGDDHYTD 500

EDDEDGNKWDYMILDEGHLIKNPNTQRAKSLLEIPSSHRIIISGTPIQNN 550

LKELWALFNFSCPGLLGDKNWFKQNYEHYILRGTDKNATDREQRIGSTVA 600

KNLREHIQPFFLRRLKSEVFGDDGATSKLSKKDEIVVWLRLTACQRQLYE 650

AFLNSEIVLSAFDGSPLAALTILKKICDHPLLLTKRAAEDVLEGMDSTLT 700

QEEAGVAERLAMHIADNVDTDDFQTKNDSISCKLSFIMSLLENLIPEGHR 750

VLIFSQTRKMLNLIQDSLTSNGYSFLRIDGTTKAPDRLKTVEEFQEGHVA 800

PIFLLTSQVGGLGLTLTKADRVIVVDPAWNPSTDNQSVDRAYRIGQTKDV 850

IVYRLMTSATVEEKIYRKQVYKGGLFKTATEHKEQIRYFSQQDLRELFSL 900

PKGGFDVSPTQQQLYEEHYNQIKLDEKLESHVKFLETLGIAGVSHHSLLF 950

SKTAPIQAIQKDEEEQIRRETALLLGRASASISQDTVINGADYAFKPKDV 1000

NLDKRINISPVDDKELSESVIKARLNRLTMLLQNKGTVSRLPDGGAKIQK 1050

QIAELTRELKDMKAAERINMPQVIDLEEDISRKMQKGLNL*



Figure 1E

>over-expressed transcript of mbm17.5 in mutant hw17

1 AGAGGACAGG GTACCCGGGG ATCAGATTGT CGTTTCCCGC CTTCAGTTTA 
51 AACTATCAGT GTTTGAATTG AAGTATTGTT TATATGTTAC GCATGGAATT 
101 TTCAGGATTC TCTTACGTCC AACGGTTATA GTTTCTTGCG AATTGATGGT 
151 ACAACAAAAG CCCCTGACAG ATTGAAGACT GTTGAAGAAT TTCAAGAAGG 
201 TGATGTGGCT CCTATATTTC TCTTGACTTC TGAAGTTGGT GGTCTCGGCC 
251 TTACTCTGAC TAAGGCAGAC CGTGTGATTG TGGTGGACCC TGCCTGGAAT 
301 CCAAGCACGG ACAACCAGAG TGTTGATCGA GCATATAGAA TTGGGCAGAC 
351 AAAGGATGAG ATCGTATATA GGTTAATGTC CTCAGCAACT GTTGAAGAAA 
401 AGATATACAG AAAGCAGGTA TACAAGGGAG GATTGTTTAA AACTGCAACT 
451 GAGCATAAAG AACAAACCCG CTACTTCAGG CAGCAGGACC TTCGAGAACT 
501 TTTTAGTCTT CCCAAGGGAG GCTTTGATGT TTCACCTACA CAACAGCAAC 
551 TATACGAAGA GCACTATAAC CGAATCAAAC TAGATGAAAA ACTGGAATCC 
601 CATGTAAAGT TTCTCGAAAC CCTTGGTATA GCTGGAGTTA GCCACCATAG
651 CTTACTTTTC TCCAAGAGAG CTCCTATTCA AGCGATACAG AAAGATGAAG 
701 AAGAACAAAT AAGGAGAGAA ACAGCATTGC TCTTGGGACG CGCATCAGCA 
751 AGTATTTCAC AAGACACCGT CATCAATGGG GCTGACTATG CTTTCAAGCC 
801 AAAGGATGTG AATTTGGACA AGAGAATCAA CATTTCCCCA GTCGATGACA
851 AGGAATTGTC AGAAAGCGTA ATTAAAGCAA GACTCAATCG TTTGACGATG 
901 CTATTACAAA ACAAGGGTAC GGTCTCAAGG CTACCTGATG GAGGGGCAAA 
951 AATCCAGAAG CAGATTGCTG AATTGACTCG AGAACTGAAA GACATGAAAG 
1001 CAGCAGAAAG GATCAACATG CCTCAAGTTA TTGACTTGGA GGAGGATATA 
1051 AGTCGGAAGA TGCAAAAAGG ATTGAATCTG TAGAGTAAGA TACAAGTCAA 
1101 GATGCAAGAA ATGCAAACGA CCATCATTGC AACACTTGTG GTTTTTTTTT 
1151 GTTCCTTATC TAATTTGGTT TGGTTGAATT GGTAAGTCAA TTACCATATG
1201 ACTTGCTGCA AAAAAAAAAA AAAAAAA 
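
The numbered listings in these figures are printed in blocks of ten bases with a running position index at the start of each line. The following is a minimal sketch, assuming Python and a hypothetical helper name `clean_sequence` (not part of the original document), of how such a listing could be collapsed into a plain sequence string for further analysis. It keeps letters only, so position numbers are discarded, including numbers that were split by spaces during scanning. It does not attempt to repair misread bases.

```python
import re


def clean_sequence(listing: str) -> str:
    """Collapse a numbered, space-separated sequence listing into one string.

    Hypothetical helper for illustration: every character that is not a
    letter (position numbers, spaces, line breaks) is dropped and the
    remaining letters are upper-cased.
    """
    return re.sub(r"[^A-Za-z]", "", listing).upper()


# First two lines of the Figure 1E listing as printed above.
figure_1e_start = """
1 AGAGGACAGG GTACCCGGGG ATCAGATTGT CGTTTCCCGC CTTCAGTTTA
51 AACTATCAGT GTTTGAATTG AAGTATTGTT TATATGTTAC GCATGGAATT
"""

sequence = clean_sequence(figure_1e_start)
print(len(sequence))      # number of bases recovered from the two lines
print(sequence[:20])      # first twenty bases
```

Because the function strips anything that is not a letter, stray scanning characters such as `^` are removed silently; a listing with such damage should be checked against the position index before use.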



Figure 2: sequences related to mbm17.6 

Figure 2A 

>predicted cDNA of mbm17.6 (DNA polymerase III) 

1 ATGAATAGAT CCGGTATTTC CAAAAAGCGA CCGCCTCCTT CGAATACCCC 
51 ACCACCGGCG GGTAAGCATC GAGCCACTGG TGATTCAACA CCATCTCCGG 
101 CCATCGGAAC CCTAGATGAT GAATTTATGA TGGAAGAGGA CGTGTTTCTG 
151 GACGAAACTC TCTTGTACGG CGACGAAGAT GAGGAATCCC TAATCCTCCG 
201 TGACATTGAG GAGCGTGAAT CGCGTTCCTC GGCTTGGGCT CGACCTCCGC 
251 TCTCCCCCGG GTATCTCTCG AATTCACAGA TTTTCCAACA ATTGGAGATT 
301 GACTCTATAA TCGCGGAGAG TCATAAGGAG GTGTTACCGG GTTCCTCAGG 
351 GCAAGGTCCA ATCATTAGGA TGTTTGGGGT TACCAGAGAA GGTAACAGTG 
401 TGTGTTGCTT TGTTCATGGA TTTGAGCCAT ACTTTTACAT TGCTTGCCCT 
451 CCTGGAATGG GGCCAGACGA TATTTCTAAT TTCCATCAGA GTGTTGAGGG 
501 AAGGATGAGG GAATCCAATA AAAATGCCAA GGTCCCGAAA TTTGTTAAAC 
551 GTATAGAAAT GGTGCAGAAA AGAAGCATTA TGTATTACCA ACAGCAAAAA 
601 TCCCAAACTT TTCTGAAGAT TACAGTTGCA TTGCCGACTA TGGTGGCAAG 
651 CTGTCGCGGC ATCCTTGATA GAGGCCTACA AATTGATGGA TTGGGTATGA 
701 AGAGCTTCCA GACATATGAA AGCAATATTC TTTTCGTTCT CCGTTTCATG 
751 GTTGATTGTG ATATTGTCGG AGGAAATTGG ATTGAAGTAC CTACTGGGAA 
801 GTATAAGAAA AATGCAAGAA CTTTGTCATA CTGCCAATTG GAGTTCCATT 
851 GCCTGTACTC AGATCTAATC AGTCATGCTG CAGAAGGTGA ATACTCAAAA 
901 ATGGCTCCAT TCCGTGTACT AAGTTTCGAT ATTGAGTGTG CAGGTCGTAA 
951 AGGACATTTT CCGGAAGCTA AGCATGATCC TGTAATCCAG ATAGCGAACC 
1001 TTGTTACTCT TCAGGGAGAG GATCACCCAT TTGTACGCAA TGTCATGACT 
1051 GTTAAGTCAT GTGCTCCAAT CGTAGGCGTA GATGTCATGT CTTTTGAAAC 
1101 AGAAAGAGAG GTCTTACTAG GTTGGAGGGA TTTGATTCGT GATGTTGATC 
1151 CTGATATCAT CATTGGTTAT AACATCTGCA AATTCGATTT ACCTTATCTG 
1201 ATTGAGAGAG CTGCTACACT GGGAATAGAG GAATTTCCTC TTCTTGGTCG 
1251 TGTAAAGAAC AGTAGGGTCC GGGTCAGGGA CTCAACATTT TCATCAAGAC 
1301 AACAAGGAAT AAGAGAAAGT AAAGAGACCA CAATTGAAGG AAGATTTCAG 
1351 TTTGACCTTA TTCAGGCAAT AGACAGAGAC GACAAATTAA GTTCTTATTG 
1401 GCTGAATTCT GTCTCAGCTC ACTTTCTTTC CGAGCAGAAA GAAGATGTCG 
1451 ACCATTCTAT AATAACTGAT CTCCAGAATG GGAATGCGGA AACCAGGAGG 
1501 CGTCTTGCTG TTTATTGTTT GAAGGATGCA TATCTTCCTG AGAGGCTTCT 
1551 GGACAAACTG ATGTTTATAT ATAATTATGT CGAAATGGCT CGTGTAACTG 
1601 GTGTCCCTAT TTCATTTCTT CTTGCGAGAG GAGAGAGTAT CAAGGTTTTA 
1651 TCTCAGGTTG TTAGGAAAGG CAAACAGAAA AATCTGGTTC TTCCAAATGG 
1701 TAAACAGTCA GGGTCCGAAC AAGGAACTTA TGAAGGCGCA ACTGTTTTAG 
1751 AAGCAAGAAC AGGTTTCTAT GAAAAGCCAA TTGCAAGTTT GGATTTTGGT 
1801 TCACTGTACC CGTCAATTAT GATGGCATAT AATCTGTGCT ACTGCACCTT 
1851 GGTGACACCT GAAGATGTAC GCAAACTGAA TCTTCCACCT GAACATGTCA 
1901 CTAAAACTCC ATGAGGGGAA ACATTTGTTA AGCAAACTTT GCAAAAGGGT 
1951 ATACTTCCAG AAATTGTCGA AGAGCTTCTT ACTGCCCGTA AGAGAGCTAA 
2001 AGCAGATTTA AAGGAGGCTA AGGATCCCCT TGAGAAGGCT GTTTTAGATG 
2051 GTAGACAGTT AGCGTTGAAG ATCAGTGCAA ATTCTGTCTA CGGGTTTACG 
2101 GGAGCCACTG TTGGGCAGTT ACCATGCTTA GAAATATCCT CGAGTGTAAC 
2151 TAGCTATGGT CGTCAGATGA TTGAACAAAC AAAGAAACTT GTTGAAGACA 
2201 AATTCACAAC ACTGGGAGGG TATCAATACA ATGCAGAGGT CATTTATGGA 
2251 GACACGGATT CAGTCATGGT GCAATTTGGA GTATCGGATG TAGAAGCTGG 
2301 GATGACCTTG GGGAGGGAAG CTGCAGAACA CATTAGTGGA ACTTTTATCA 
2351 AACCCATCAA ATTGGAGTTT GAAAAGGTCT ATTTCCCATA TCTTCTCATT 
2401 AACAAGAAGA GGTATGCTGG TTTGCTATGG ACAAATCCTC AACAGTTTGA 
2451 CAAAATGGAC ACCAAAGGAA TCGAGACAGT ACGAAGGGAT AATTGTTTAC 
2501 TGGTTAAGAA CCTCGTGACT GAGAGTCTTA AGAAAATACT TATTGATAGA 
2551 GATGTTCCAG GGGCAGCTGA AAATGTCAAG AAAACCATTT CGGATCTTCT 
2601 CATGAACCGT ATTGACTTGT CACTTTTGGT GATTACTAAG GGTCTAACGA 
2651 AAACAGGAGA TGATTATGAA GTTAAATCAG CTCATGGTGA ACTTGCTGAA 
2701 CGCATGCGTA AGAGGGATGC TGCTACAGCG CCAAATGTTG GAGATCGAGT 
2751 ACCGTATGTT ATCATAAAAG CTGCTAAAGG TGCCAAGGCT TATGAACGAT 
2801 CAGAAGATCC AATCTACGTG CTACAGAATA ATATCCCTAT AGACCCAAAT 
2851 TACTACTTGG AGAATCAGAT TAGCAAGCCA CTTGTTAGGA TTTTTGAGCC 
2901 AGTCGTGAAA AATGCTAGCA AGGAGCTTCT CCATGGAAGT CACACGAGGT 
2951 CAATATCAAT CACTACTCCT TCAAACAGCG GTATAATGAA GTTTGCTAAA 
3001 AAACAACTGA GCTGTGTTGG CTGCAAAGTT CCGATCAGGT ACTTTGTGCA 
3051 ATGGAACACT ATGCGCAAGT TGCAAGGGAA GAGAAGCCGA GTTATATTGC 
3101 AAAAACGTGT CTCAAGGTAT GCTGCCTGGC TGAGCTTGAA GAGGTTTTTG 
3151 GGAGGCTGTG GACACAGTGC CAGGAGTGTC AAGGCTCTCT TCATCAAGAT 
3201 GTCTTGTGCA CCAGTCGAGA TTGTCCAATA TTTTACCGGA GAATGA 



Figure 2B 

>predicted protein sequence of MBM17.6 (DNA polymerase III) 
MNRSGISKKRPPPSNTPPPAGKHRATGDSTPSPAIGTLDDEFMMEEDVFL 50 
DETLLYGDEDEESLILRDIEERESRSSAWARPPLSPAYLSNSQIFQQLEI 100 
DSIIAESHKELLPGSSGQAPIIRMFGVTREGNSVCCFVHGFEPYFYIACP 150 
PGMGPDDISNFHQSLEGRMRESNKNAKVPKFVKRIEMVQKRSIMYYQQQK 200 
SQTFLKITVALPTMVASCRGILDRGLQIDGLGMKSFQTYESNILFVLRFM 250 
VDCDIVGGNWIEVPTGKYKKNARTLSYCQLEFHCLYSDLISHAAEGEYSK 300 
MAPFRVLSFDIECAGRKGHFPEAKHDPVIQIANLVTLQGEDHPFVRNVMT 350 
LKSCAPIVGVDVMSFETEREVLLAWRDLIRDVDPDIIIGYNICKFDLPYL 400 
IERAATLGIEEFPLLGRVKNSRVRVRDSTFSSRQQGIRESKETTIEGRFQ 450 
FDLIQAIHRDHKLSSYSLNSVSAHFLSEQKEDVHHSIITDLQNGNAETRR 500 
RLAVYCLKDAYLPQRLLDKLMFIYNYVEMARVTGVPISFLLARGQSIKVL 550 
SQLLRKGKQKNLVLPNAKQSGSEQGTYEGATVLEARTGFYEKPIATLDFA 600 
SLYPSIMMAYNLCYCTLVTPEDVRKLNLPPEHVTKTPSGETFVKQTLQKG 650 
ILPEILEELLTARKRAKADLKEAKDPLEKAVLDGRQLALKISANSVYGFT 700 
GATVGQLPCLEISSSVTSYGRQMIEQTKKLVEDKFTTLGGYQYMAEVIYG 750 
DTDSVMVQFGVSDVEAAMTLGREAAEHISGTFIKPIKLEFEKVYFPYLLI 800 
NKKRYAGLLWTNPQQFDKMDTKGIETVRRDNCLLVKNLVTESLNKILIDR 850 
DVPGAAENVKKTISDLLMNRIDLSLLVITKGLTKTGDDYEVKSAHGELAE 900 
RMRKRDAATAPNVGDRVPYVIIKAAKGAKAYERSEDPIYVLQNNIPIDPN 950 
YYLENQISKPLLRIFEPVLKNASKELLHGSHTRSISITTPSNSGIMKFAK 1000 
KQLSCVGCKVPIRYFVQWNTMRKLQGKRSRVILQKRVSRYAAWLSLKRFL 1050 
GGCGHSARSVKALFIKMSCAPVEIVQYFTGE* 
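
The relationship between a coding sequence such as the predicted cDNA of Figure 2A and a protein listing such as Figure 2B can be checked by conceptual translation. The following is a minimal sketch, assuming Python, the standard nuclear genetic code, and a hypothetical function name `translate`; it is an illustration and not the prediction method used for the figures. It reads codons from the first base and stops at the first stop codon, which it writes as `*`, the same symbol that ends the protein listings.

```python
# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1 up to the first stop codon.

    Hypothetical helper: codons containing characters other than A, C, G, T
    (for example scanning errors) are reported as 'X'.
    """
    dna = dna.upper()
    protein = []
    for start in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE.get(dna[start:start + 3], "X")
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


# First 48 bases of the Figure 2A listing (spaces and numbering removed).
cdna_start = "ATGAATAGATCCGGTATTTCCAAAAAGCGACCGCCTCCTTCGAATACC"
print(translate(cdna_start))
```

The translated residues can be compared with the start of the Figure 2B listing; positions where the two disagree point either to a scanning error in one of the listings or to a real difference between the sequences.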

Figure 3: Osmbm17.5 

>EST clone RI CS1 367A. Oryza sativa homolog of mbm17.5, partial sequence 
1 aggaactttc agttgtgagc ctcaaagata agatcagaga ctactctggt 
51 cccaatgcaa atgctcgcaa ctatgagctt aaatatgcct tcaaggaggg 
101 tggaatcctt ttaacaacat atgacattgt tcgaaacaat ttcaagatga 
151 taaaaggcaa cttcaccaat gattttgatg acgaggaaga aacattatgg 
201 aactatgtta ttcttgatga ggggcatatt atcaagaatc caaagactca 
251 gagggctcaa agtctatttg aaataccctg tgcacatcgt attgtcatca 
301 gtggaacacc catacaaaat aacttgaagg aaatgtgggc tctgttttat 
351 ttctgttgcc cagaagtctt gggtgataag gagcagttca aagcaaggta 
401 tgagcacgct atcattcaag gaaatgacaa gaatgctacc aatcgacaaa 
451 agcacatagg ctcaaatgta gcaaaggaat taagagaacg gataaagcca 
501 tactttttgc gacgcatgaa gaatgaagtg tttcttgata gcggcacggg 
551 agaagataaa aagcttgcta agaagaatga gctaattatc tggctgaaat 
601 taacatcttg ccagaggcaa ttatatgaag cttttcttaa cagtgaacta 
651 gttcattcat caatgcaagg gtcacccttg gccgcaatca cgatattgaa 
701 gaaaatatgt gatcatccgc tgttgttgac taagaaagct gctgagggtg 
751 TTTTGGAAGG CATGGATGCG ATGTTAAATA ATCAAGAAAT GGGAATGGTT 
801 GAGAAAATGG CCATGAACCT TGCAGATATG GCTCATGATG ATGATGACGT 
851 TGAATTGCAA GTTGGTCAGG ATGTCTCGTG CAAGTTATCT TTTATGATGT 
901 CCTTGCTCCA AAATCTTGTT AGCGAGGGAC ACAACGTCTT AATCTTCTCG 
951 CAAACTCGTA AAATGCTAAA CATTATTCAG GAGGCTATAA TATTAGAAGG 
1001 CTATAAGTTT TTGCGCATTG ATGGTACCAC CAAGATTTCT GAGAGGGAAA 
1051 GGATTGTGAA GGACTTCCAA GAGGGTCCTG GAGCTCCAAT ATTTTTGCTG 
1101 ACCACACAAG TTGGTGGGCT TGGACTTACA CTCACCAAGG CAGCTCGTGT 
1151 CATAGTAGTT GATCCTGCTT GGAATCCAAG TACGGACAAT CAAAGTGTTG 
1201 ATCGTGCTTA TCGAATTGGG CAGATGAAAG ATGTCATCGT ATACCGCCTT 
1251 ATGACATCTG GAACCATCGA AGAAAAGATA TACAAATTGC AGGTCTTCAA 
1301 GGGGGCTCTG TTTAGGACAG CTACAGAGCA CAAAGAACAA ACTCGTTATT 
1351 TCAGCAAGAG GGATATTCAA GAGCTTTTCA GTCTGCCTGA GCAAGGTTTT 
1401 GATGTTTCGC TGACACAAAA GCAATTGCAA GAAGAGCATG GACACCAACT 
1451 TGTGATGGAC GACTCCTTGA GGAAGCATAT ACAATTCCTG GAGCAACAAG 
1501 GCATAGCGGG CGTGAGCCAT CACAGCCTTC TGTTTTCTAA GACAGCAATC 
1551 TTACCTACAC TGAATGATAA TGATGGTTTG GACAGTCGTC GAGCTATGCC 
1601 AATGGCCAAG CACTACTACA AGGGAGCCTC ATCTGACTAT GTTGCCAATG 



1651 GTGCTGCCTA TGCGATGAAG CCAAAAGAGT TCATTGCTCG AACATACTCC 
1701 CCGAACAGCA CAAGCACAGA AAGTCCTGAG GAAATCAAGG CCAAAATCAA 
1751 CCGGCTTTCG CAAACCCTTG CAAACACGGT GCTTGTGGCG AAGCTACCAG 
1801 ATCGTGGAGA CAAGATAAGG AGGCAGATAA ATGAGCTGGA CGAAAAGCTG 
1851 ACCGTGATCG AGTCTTCTCC GGAGCCATTG GAGAGGAAGG GTCCAACGGA 
1901 AGTAATCTGC TTGGATGATC TGTCTGTCTA GTGTAGGGCA TGTCTGTTTC 
llVi ?xrTGCTT^/>^TTCCATGCT TGCATGCTAG TAGTCACTAA GGCGTGACAT 
inn. ATf^-r ACT^^^ ATTGTGACGA CCACGGAACG GAACACATGC 

2101 TTGACCAAAAAAAAAAAAAA 

>F<n- cllone 6*030^^ Zea mays homolog of mbm17.5, partial sequence 

1 Va™ggaca ACCAGGACGA CGGTGAAAGC ATACTCGACA TCCTAGACGA 
.;i rrrr A?CACA CGA^GACICICTATCCGJ CCAGAAGCCC AGCACCGCCG 
ini rr A(^§TCC^^^ TGCCGTGCGC CATCACCGTG 

IVi ricGtl%ACC TAGATGACCA TAGCCCAGAT GATGTGGATG CTCACGCCGG 
lr^^ ?r r?^2?f^A CCCCTTC^ TGATGAAGCT AGGGCTCCCA 

Isi SSgIcSTtc^^agg^^ ATTTAGTCTC ctcagcctgt 

loi ACCCATTATG C(^^^^^ CGTCCGTGGC AAGGGGAAGA ACAAAGGGAC 

??1 CACCAAGGAT GT^^ TAAATAGGGT ATCAAAGGCG TGATCGTTTG 

ini 2^TCTTA^^^^ GATTATGAGG ACTGCGAGGA GGACCAAGGA 

lSl IL^SSLaG ATT^G^^^ GATTCACAAG 
ini r A^^C^cS /^C^^^^ CATTCAGGAA CCATGGTGTC AGCGACGATG 
Si ?gct§ggtSa gS.g>^^^ CTGTGGAGAA CAATGCTGAG 

II] GA?G??GGA^ GGGG^SaGA CAGAGGACTT CAAGATGGAT CCAACTGG^ 
65 ctS^C ATGCAAGCCATACAAGCT^^^^ 

-rn^ r-TT-r-rnnrrr APP AGCGCGA GGGCCTCCGA TGGCTCTGGG \\ O I CjUAU i 

nm TGCAGGTTGC TGCATTTTTG GGTGGACTGT TTCATTCTCG TCTAGTCAAU 
II] IgGgSa ;?G?rGCTCC AAAGACACTT CTGG^^^^^^ 
901 GCTTTCAATT GTTGGCCTTA AAGAAAAGAT CAGAGACTAG TCTGGCCCCA 
II] GCAC^AT TCGCAATTAT GAACTCCAAT ATGCCTTCAA GGAGGGTGGT 
inni ATrrt^TAA SiACCTATGA CATTGTeAGG AACAACTACA AGCTCATAAG 
Ss ^^^G?S5t^C TA^^^ TGATGATGAG GAAGGAACTT 

?0 ^CTgS^X™ CGTAATTCTT GATGAGGGAC ATCTAAT^^^ 
1 151 ACACAAAGGG CGCAAAGTTT GTACGAAATA CCTTGTGCCC ATC^^ 
\oM RATPAGTGGA ACACCTATTC AAAATAACTT GAAGGAAATG TGGACTCTGT 
1251 ?cIS^^CTG T^^^^ ATAAACAGGA OT^pAAJ^;!^^ 

1?01 AGCTATGA^^^^^^ TCGAGGAAAT GACAAAAATG CTACCGCTCG 

fs'l aIS^G GTAGGCTCAA ATG^^^^^^ 

501 Sa^GTTAACA CCATGCCAGA GGAAACTATA TGAAGCTTTT CTAAATAGTG 
ll?i ?GC?GGTOA TTTA^^ CAGCCAAAGG CATCACCGTT GGCTGGAATC 

60 AcS?ATTGA AG^^ TGATCATCCA CTGGTATTAA CTAAGAAAGG 
llsi TGCTGAGGGT^^^ GAATGGGTGA AATGTTGAAT GATCAAGACA 

?o T?g5S^?gg-; IgSJaaaatg gccatgaacc ttgcagatat GGCTCAT^^ 
1751 GATAATGCAC tggaagttgg tcaggatgtc tcatgcaagc TATCATTCAT 

80 CATGTCCTTG TTGCGGAACC TTGTTGGAGA GGGGCATCAT GTTTTAATAT 
lis mCACAGAC TCGTAAAATG GTAAACCTTA TTCAGGAAGC TATAATATTA 



1901 GAGGGCTATG CGTTTTTGCG CATTGATGGC ACCAGCAAGG TTTGTGACCG 
1951 GGAAAGGATT GTGAAGGACT TCCAAGAGGG TTGTGGAGCT CCAGTTTTTC 
2001 TGCTAACCAC ACAAGTTGGT GGGCTTGGAC TTACACTCAC CAAGGCAACT 
2051 CGTGTCATTG TAGTTGATCG TGGATGGAAC CCTAGTACAG ACAATCAAAG 
2101 TGTTGATCGT GCTTACCGAA TTGGACAGAC TAAAAATGTG ATTGTATAGG 
2151 GCTTGATGAC ATCTGCGACC ATTGAAGAAA AGATATACAA ATTGCAGGTT 
2201 TTGAAGGGCG CTCTGTTCAG GACAGCTACG GAGCAAAAAG AGGAAACACG 
2251 TTACTTCAGC AAGAGTGAGA TTCAAGAGCT ATTTAGTTTG CCACAACAAG 
2301 GATTTGATGT TTCCCTCACA CATAAGCAGT TGCAAGAAGA GCATGGTCAA 
2351 CAAGTTGTTC TGGATGAGTC GTTGAGGAAG CATATACAGT TTCTGGAGCA 
2401 ACAAGGAATA GCCGGTGTGA GTCATCACAG CCTCCTATTC TCTAAAACTG 
2451 CAACCCTGCC CACTCTGAGT GAGAATGATG CACTGGACAG CAAACCTCGG 
2501 GGCATGCCCA TGATGCCCCA GCAATATTAC AAGGGATCCT CATCTGACTA 
2551 TGTCGCGAAC GGGGCATCTT TTGCGCTGAA GCCAAAGGAT GAAAGTTTCA 
2601 CTGTTCGAAA CTACATTCCA AGTAACAGAA GCGCAGAGAG TCCTGAAGAG 
2651 ATAAAGGCAA GAATCAACCG GCTTTCACAG ACCCTCTCCA ACGCTGTGCT 
2701 GTTGTCGAAG CTACCAGATG GTGGTGAGAA GATAAGGAGG CAGATAAATG 
2751 AGCTGGACGA GAAGCTGACT TCTGCTGAGA AGGGGCTGAA GGAGGGGGGC 
2801 ACTGAAGTGA TTTCCTTGGA TGACTGATCC AAGACATGGA GAGTCTGTGC 
2851 TCGGCAAAAG TAAA 

Figure 5: AtIno80 and related sequences 
Figure 5A 

>Atino80 coding sequence and derived protein 

ATGGATCCTTCAAGACGACCACCGAAGGACTCTCCTTACGCGAATCTATTCGATCTCGA 

GCCGTTGATGAAGTTTAGAATTCCGAAACCTGAAGATGAAGTTGATTATTATGGGAGTA 

GTAGCCAGGATGAAAGTAGAAGCACTCaaggtggggtagtggcaaactacagcaatgggtctaaatcgaga 

atgaatgcgagctccaagaagagaaagcggtggacagaagctgaggatgcagaggacgatgatgatctctacaatcaacat 

gttactgaggagcactaccgatcaatgcttggggagcatgtacaaaaattcaaaaataggtccaaggagaclcaagggaatcc 

tcctcatctgatgggttttccggtgctaaagagcaatgtgggcagttacagaggtaggaaaccagggaatgattaccatgggag 

gttctatgacatggacaactctccaaattttgcagctgatgtgaccccacataggcgaggaagctaccatgatcgtgatattacac 

ccaagatagcatatgaaccttcgtatttggacattggtgatggtgtcatrtacaaaatccccccaagttatgacaagrtggtggrat 

cattaaacttaccgagcttttcagacattcatgtggaagaamtacttgaaaggaactctggatctGAGATCATOGCAG^ 

ACTGATGGCAAGTGATAAAAGGTCTGGAGTAAGAAGGCGTAATGGAATGGGTGAGCCT 

CGACGTCAATATGAATCTCTTCAAGGTAGAATGAAGGCCCTGTCACCTTCAAACTCCAC 

CCCAAATTTTAGCCTCAAGGTGTCAGAAGCTGCAATGAATTCTGGCATTCGAGAAGGAT 

CTGCTGGAAGTACTGCACGGACAATTCTGTCTGAGGGTGGTGTTTTACAGGTCGATTAC 

GTGAAGATTCTGGAGAAGGGGGATACATACGAGATTGTTAAACGAAGTCTACCGAAGA 

AGCTGAAAGCAAAGAATGATCCTGCAGTCATTGAGAAAACAGAAAGGGATAAAATTAGA 

AAAGGCTGGATCAATATTGTCAGAAGAGATATAGCAAAACACCATAGAATTTTCACTAGT 

TTTCATCGTAAACTATCAATTGATGCGAAGAGGTrTGCAGATGGTTGCGAAAGAGAGGT 

GAGAATGAAGGTGGGTAGATCATACAAAATCCCAAGAACTGCACCAATTCGCACTAGGA 

AGATATCGAGAGACATGCTGCTATTCTGGAAGCGATATGACAAGCAGATGGCAGAAGA 

GAGGAAAAAGCAAGAAAAGGAAGCTGCAGAGGCTTTTAAACGTGAACAGGAGGAGGGA 

GAGTCAAAAAGGCAGCAACAAAGGCTGAATTTCCTTATTAAACAGACTGAGCTTTACAG 

TCACTTCATGCAAAAGAAGACCGATTCGAATCCTTCCGAAGCCTTACCAATAGGTGATG 

AAAATCCGATTGACGAAGTGCTCCCAGAAACTTCAGCGGCAGAACGTTCTGAGGTAGA 

GGATCGTGAAGAGGGTGAAGTGAAGGAAAAGGTCTTGAGAGGTGCGCAAGATGCGGTG 

TCTAAGCAGAAGCAAATAAGAGATGGATTTGACACTGAATATATGAAGCTAGGCCAAACT 
TCTGAAATGGAAGGTCCTTTAAATGATATATCAGTTTCTGGCTCGAGCAATATAG^ 



SSSaatgaatggttctcaaaaggaattgagaatcatgctgaacacggaggcac^^ 
SX^atgcgat^t^^^^^^^^ 



AG^TG^^A^^^^^ 
^?^TrAScAAcS^^GAAGGTTTGC^^ 

SSS^^TTCTGGTGGTCAAAATC^^^ 



^Sg^act^S^a^^^ 



a^?^^tcgaS?cTg^^^^ 

AGAACTTCCAGTTGTGCAGCCTGCGCTTC^ 
CT^^GCAAAGTTTO^^ 

aaI^g^^caJS^gatg^ 

^^GGC^TGGA^^^ 
CATAGGAGCGATAT^^^ 

agatgtta 
cagcttgt 

^ag^STtagatgStc^^^^ 



^atgttactgtttatcgtctcatctgtaagg^ 

rATAArGGACAGGAACCTTTGGAAGAACCGGAAAAGCCAAAATCCAGTAATAAAAAGAG 
?Ar^rTGCT^CAAA^^^ 

GTAGCTCCGCTAACTAA 
Figure 5B 

>Derived AtIno80 protein sequence 
MDPSRRPPKDSPYANLFDLEPLMKFRIPKPEDEVDYYGSSSQDESR^^ 

GWANYSNGSKSRMNASSKKRKR\An-EAEDAEDDDDLYNQHW^ 
GEHVQKFKNRSKETQGNPPHLMGFPVLKSNVGSYRGRKPGNDYHGRFYDM 



DNSPNFAADVTPHRRGSYHDRDITPKIAYEPSYLDIGDGVIYKIPPSYDK 

RPQYESLQARMKALSPSNSTPNFSLKVSEAAMNSAIPEGSAGSTARTILS 
EGGVLQVHYVKILEKGDTYEIVKRSLPKKLKAKNDPAVIEKTERDKIRKA 
WINIVRRDIAKHHRIFTTFHRKLSIDAKRFADGCQREVRMKVGRSYKIPR 
TARRTRK^RDMLLFWKRYDKQMAEERKKQEKEAAEAFKREQ 

QQQRLNFLIK^^^ 

pseveiSeaelk^ 
lndisvsgsTnidlhnpstmpvtsw 

YEQGLNGILADEMGLGKTIQAMAFLAHLAEEKNIWGPFLWAPASVLNNW 

ADEISRFCPDLKTLPYWGGLQERTILRKNINPKRMYRRDAGFHILITSYQ 

LLVTDEKYFRRVKWQYMVLDEAQAIKSSSSIRWKTLLSFNCRNRLLLTGT 

PIQNNMAELWALLHFIMPMLFDNHDQFNEWFSKGIENHAEWGGTL^^^^ 

NRLHAILKPFMLRRVKKDWSELTTKTEVTVHCKLSSRQQAFYQAIKNKI 

sS:elfSrgqft^^ 

?^^NSLLPHPF^^^ 

rgisresflkhfniyspeyilksifpsdsgvdqwsgsgafgfsrlm^^^^ 

psevgyu».lcsvaerllfsilrwerqfldelvnslmeskdgdlsdnnier 

vktkavtrmllmpskvetnfqkrrlstgptrpsfealvishqdrflssik 

llhsaytyipkarappvsihcsdrnsayrvteelhqpwlkrlligfarts 

eangprkpnsfphpliqeidselpwqpalqlthrifgscppmqsfdpak 

qlvmtgghvqgddflgaadwsllmddaeaaqleqkfrelplqvkdr^^ 

KTKRIRIlSrEGDATLEELEDVDRQDNGQEPLEEPEKPKSSNKKRRAASNP 

tSISpi^Kl^NGE^^^^^^ 
DPSSSAN* 

I 

>Alignment of AtIno80 sequence and public sequence At3g57300, showing splicing difference 



Query: claimed sequence 

Sbjct: gi|18410689|ref|NM_115590.1| (AGI:At3g57300) 

oue..: X tmtnn^ ntn^nm mH!?^^^^^^^ To 

Sbjct: 1 UgUtccttcUgicgaccaccgaaggactctccttacgcgaatctatt^ 60 
Query: 61 ccgttgatgaagtttagaattccgaaacctgaagatgaagttgattattat^ 120 
Sbjct: 61 ciUUaUaiUttigiUtcigaaacctgaagatgaagttgattattatgggagtagt 120 



Query: 121 agccaggatgaaagtagaagcactcaaggtggggtagtggcaaactacagcaatg^ 180 

Query, i^i 1 Ml I HI I HI 1 1 1 1 II 1 1 1 1 M 1 1 1 II M 1 1 M H II 1 1 1 II I M II M 1 1 1 I Ml 

Sbjct: 121 igccaggatgaaigtagaagcactcaaggtggggtagtggcaaactacagcaatgggtct 180 

Query: 181 aaatcgagaatgaatgcgagctccaagaagagaaagcggtggacagaagctgaggatg^ 240 

Sbjct: 241 giggicgaigatgatctctacaatcaacatgttactgaggagcactaccgatcaatgctt 300 

Query: 301 ggggagcatgtacaaaaattcaaaaataggtccaaggagactcaagggaatcctcctcat 360 
Sbjct: 301 
Query: 361 c 



Sbjct: 421 iiUiiUciUgggaggttctatgacatggacaactctccaaattttgcagctgatgtg 480 

Ouery: 48X -ccacataggc^^^^^^^ ^ 
Sbjct: 481 U!:iciciUiUUiUUcUicatgatcgtgatattacacccaagatagcatatgaa 540 

— S41 -tcgtatttgg^^^^^^^ ^ 
Sb^ict : -S41- 4UUiUUUgi«ttggtgatggtgtcatctacaaaatccccccaa.gttatgacaag 600 

— eol --^---Itnnnntt?^^^^^^^ To 

Sbjct: 601 UiUggcaiciUlaicttaccgagcttttcagacattcatgtggaagaattttacttg 660 

::: ^^^^^^^^^ 

X .^A. OAf\ 



m 1 n 1 1 III m n I III n m n m 1 1 III M 111 1 1 1 111 1 1 11 1 1 1 j. 1 1 > » ' ' 

igigagcatgtacaaaaattcaaaaataggtccaaggagactcaagggaatcctcctcat 360 

ctgatgggttttccggtgctaaagagcaatgtgggcagttacagaggtaggaaacca^^^ '"n 

IIMlllllllllllllllllllllll'''<lll>'l''''''''''l'''''''iii,il 
rUaUgUtttccggtgctaaagagcaatgtgggcagttacagaggtaggaaaccaggg 

J ai-i-'H-fTnaactaaiiata 



420 
420 



:::: :::: =eb^^^^^ 
:::: ehihihiei^^ :::: 

Query: 1141 gcagatggttgccaaagagaggtgagaatgaa^^^^ "00 

Sbjct: 1141 iiiUUUUiiiiiUiUU^^^^^^ 

Ouery: 1201 -^—aatt^^^^^^^ 

Sbjct: 1201 ii:UUcciiUcicictigg;agatatccagagacatgctgctattctggaagcgatat 1260 
Ouerv: 1261 -^-gcagatggcagaagagaggaaaa^^^^^^ 

Sbjct: 1261 UUiUiUUiciiiiUiiggaiaaagcaagaaaaggaagctgcagaggcttttaaa 1320 
Query: 1321 cgtgaacaggagcagcgagagtcaaaaaggcagcaacaa^ "80 
Sbjct: 1321 UUiUiiUUiUUUUiiii^ 
Ouerv: 1381 -gactgagc^^^^^^ 

Sbjct: 1381 UUcUiiUUiciiicicUciUcaaaacaagaccgattcgaatccttccg-agcc 1440 

. ^x . — 1 cnn 



Query: 1441 ttaccaata? 

iiiiiiiir 
Sbjct: 1441 ttaccaataggtgatgaaaatccgattgacgaagtgctcccagaaacttcagcgg -agaa 1500 

Oue.v: ISOl -tctgagg^a^^^^^^^^ "J 
Sbjct: 1501 cUtiiiaiUagaggatcctgaagaggctgaactgaaggaaaaggtcttgaga.,ctgcc 1560 

Query: 1561 caagatgcggtgtctaagcagaagcaaataacagatgcat^ 1620 
sj... 1561 UiUUUUU^iiUiUiiU^^^^^^ 

1621 ctacgccaaa^^^^^^^ -J 
Sbjct: 1621 c[icicciaicUrtgiiiiggaaggtcctttaaatgatata1:cagtttctggct«.gagc 1680 

Ouerv: 16B1 -atagatt.gcata^^^^^^^ -0 
Sbjct: 1681 iiUtiUtttg«t;Uccatctacaatgcctgttacatcaacagttcagactc.:agag 1740 

Ou"e.y: 1741 "atttaaaggaacc^^^^^^ 

Sbjct: 1741 UiltUiiiUiciUUiiUitUiaaatgaaaggccttcagtggctagtcaattgt 1800 

Ouerv: 1801 -gagcagggttt^^^^^^^ -° 
Sbjct: 1801 liUiUagiUttiiitggcUacttgctgatgaaatgggcttgggtaagactattcaa 1860 

Sbjct: 1861 ilUUUUiciUUiciUU;rtgaggaaaagaacatttggggtccatttcttgtt 1920 



1980 



Ouery: 1921 ^ttgcccctgcctctgttctta^^^^^^ 
Sb3ct: 1921 UUUcUiiUiiUiciUil^Uitgggctgatgaaatcagtcgtttctgtc -tgac 1980 

Ouery: 1981 ^tgaaaactctt^^^^^^^ -J 
Sbjct: 1981 UUiUrtUUciliiUgggaggattacaagaacgaacaattttaaga^agaatatc 2040 

Ouery: 2041 --aag^^^^^^^ -J 
Sbjct: 2041 iilUUigcgtaUiiccgii^ggatgctggctttcatattttgattactagctatcag 2100 

Ouery: 2101 -attagtcactgatgaaaagtatt^^^ "J 
Sbjct: 2101 UiUiiUicUiUiUigUitttcgccgggtgaagtggcaata'tatggtgctagat 2160 

Ouery: 2161 --ccaagcaatca^^^^^^^ -° 
Sbjct: 2161 UggcccUgcUtcaagagttcctccagtataagatggaaaacccttcttagttttaac 2220 

^ery. 2221 tg-ggaaccgattgcttctgactggt^ 

Sbjct: 2221 UiUUiccUUUUUUcUUiiiccaattcagaacaacatggcagagttatgg 2280 

Ouery: 2281 --gctgcatttc^ -° 
Sbjct: 2281 UccUcUciUiiU^igccaatgttgtttgacaaccatgatcaatttaatgaatgg 2340 

Ouery: 2341 --aaaagg^^^^^^^ -J 
Sbjct: 2341 UUciiiigUaUgigiatcitgctgaacacggaggcactttaaatgagcaccagctt 2400 

Query: 2401 aacagactgcatgcgatcttgaaaccgttcatgcttcgacgggtaaaa^ 2460 
Sbjct: 2401 iiiiUiUUUUiUUUii^^^^^^^^ 

Query: 2461 ^^^nttnt M itTtT^n utnt un^T^ ZZ 

Sbjct: 2461 UtiiicUicMUiigic^giagttacagtacactgcaagctcagttctcgacaacaa 2520 
ouerv 2521 actttttatcaggctattaagaacaaaatttctctggctgagttgtttgatagcaaccgc 2580 
Query. 2521 gctttttatcaggc ^ 111111111111111111111111111 I I < I < • "iiiUi™ 



~ — TiiMiniiniiiiiiiMiliiiiiiililiii I iiniimuuniiiiii 

Sbjct: 2521 UtUtUicUiciitUagUiaaaatttctctggctgagttgtttgatagcaaccgc 2580 

Query: 2581 ggacaatttactgataagaaagtattgaatttaatgaatattgtcatt«^^ 2640 

I I 1 III I 1 I I II 1 1 I I I I I I I 1 I H I I 1 1 1 1 I I I I 1 " ociin 

Sbjct: 2581 iUciiiUiUgitiigUUUtUaatttaatgaatattgtcattcaactaaggaag 2640 



2700 



Sbjct: 264X UUiUiUUiciUUUitciiiUUtgaagggagctc^^^^^ 2700 



2701 gtgacttccaattctcttttgccccatccctt^^^^^ 



2760 



Query: 2701 11^1 m 2,60 

Sbjct: 2701 UUcUcciiUcicitttgccccatccctttggtgagctagaggatgtacattattct 2760 



Query: 



2820 



sbjct: 276X iUUi;uuu;;u;u;;;«;ga;;=ct«gci4^^^ 2320 



Ouexy: 2821 -ttctgaaacattttgttcttct^^^ 

Sbjct: 2821 iiUcUiiicatUUttctUUtcgggcgtggcatctcaagagaatcttttctgaag 



Query: 



Sbjct: 2881 



2881 cattttaatatatattcacctgagtatattcttaagtcaatatt^ 

UUUiiUUUU«cctgigtatattcttaagtcaatattcccatct^ 



2880 
2880 

2940 
2940 



Query: 



Sbjct: 2941 gtagatc 



Query: 



3001 ccatcagaagttggatatctggctctgtgttctgttgca^ 



3060 



3120 



Sbjct: 



3X21 iUUUitiu;icUi-i;;^^^^^^^ 



3240 



Ouerv: 3X31 -gatgccatcaaaagttgaa^^^^ 

Sbjct: 3181 UUUccitciiiiitUiaicgaattttcagaaaaggagactaagcacagggcctacc 3240 



Query: 3241 <f^-cttcatttgaag=gctagtgatc^^^ 

Sbjct: 324X iUccUUltUiUcJcUJUatctctcatcaggataggtttctttcaagtatcaaa 



a 3300 
3300 



Query: 3301 
Sbjct: 3301 



3360 



itciUciUcUUUtaittitatcccaaaagccagagctccacctgtaagcattcat 3360 



Query: 336X tgctcggacagaaattcggcatacagagttacagaagaattacatca^^^ 



3420 



n?i mm I mm I 111 mil 11 1 1 111111111111111111111111111111 
Sbjct: 336X UUUiUiUiiUUiciticUiUt^^^^ ^^2° 



Query: 342X -actattaatcggttttgca^ 

Sbjct: 342X iUcUtliiiUiitUicic^Ui^icagaagctaatggacccaggaagcctaacagc 

Querv 34ex tttccacatcctttaatccaagaaattgattcagaacttccagttgtgcagcrt^^ 
Query. , , | , ,T| | 1111111111111111111111111111111111111111111 "<'''''' 

Sbjct: 348X Utccacaicrtitaatccaagaaattgattcagaacttccagttgtgcagcctgcgctt 

Query: 354X caactgacacacagaatatttggttcttgccctccaatgcaaagttttgacccagcaaag 



3480 
3480 

3540 
3540 

3600 



MtlltiltlllllllllllMMlIltllMilllMliniltlMlf IliMIilH 
Sbjct: 3541 caactgacacacagaatatttggttcttgccctccaatgcaaagttttgacccagcaaag 3600 

Query: 3601 ttgctcacggactctgggaagctgcagacacttgatatattattgaagcggcttcgagct 3660 

tiuiiniitiMititiiiiiMiiiiiiiiiiiMiniiiiiiiiiniiiitnl 

Sbjct: 3601 ttgctcacggactctgggaagctgcagacacttgatatattattgaagcggcttcgagct 3660 
Query: 3661 ggaaatcacagggtgctcctgtttgcacaaatgacaaagatgctgaacattctcgaggat 3720 

itiMiiiiiniiiiiiiiiittiiiiiniiiiiiNiMiiiiiiiniiiiinii 

Sbjct: 3661 ggaaatcacagggtgctcctgtttgcacaaatgacaaagatgctgaacattctcgaggat 3720 

Query: 3721 tatatgaactatagaaagtacaagtacctcaggcttgatggatcctccaccatcatggat 3780 

lllllllllllltlllllllllMlllllllllllllMliniMIIIIIIIIIINll 
Sbjct: 3721 tatatgaactatagaaagtacaagtacctcaggcttgatggatcctccaccatcatggat 3780 

Query: 3781 cgccgagatatggttagggattttcagcataggagcgatatttttgtattcttgctgagc 3840 

M I i ) t M I t I I i ) I i 1 I 1 I ) I M I 1 1 1 1 1 i I 1 1 I 1 I 11 I 1 i I I 1 1 [ I i I i.t I ( I I i t i I 
Sbjct: 3781 cgccgagatatggttagggattttcagcataggagcgatatttttgtattcttgctgagc 3840 

Query: 3841 accagagctggaggacttggtatcaacttgacggctgcagacactgtcattttctatgaa 3900 

lUMIIItllllllitiiliMINIMIMMDIIiMllllllllllMMIt 11) 
Sbjct: 3841 accagagctggaggacttggtatcaacttgacggctgcagacactgtcattttctatgaa 3900 

Query: 3901 agtgattggaatcccaccttggatttacaagctatggacagggctcatcgtcttggacag 3960 

lllltll IlliniilMIIMIM INIIIMIitlllllMiMnilltllllllll 
Sbjct: 3901 agtgattggaatcccaccttggatttacaagctatggacagggctcatcgtcttggacag 3960 

Query: 3961 acaaaagatg 3970 

llMUIill 
Sbjct: 3961 acaaaagatg 3970 



Score = 1001 bits (505), Expect = 0.0 
Identities = 522/528 (98%), Gaps = 6/528 (1%) 
Strand = Plus / Plus 
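
The percentages in the alignment summary follow directly from the counts given next to them. The following is a minimal sketch, assuming Python and a hypothetical helper name `blast_percent`, that reproduces the two figures; the use of integer truncation rather than rounding is an assumption inferred from the values shown (522/528 is about 98.9% but is reported as 98%).

```python
def blast_percent(count: int, alignment_length: int) -> int:
    """Return count as a whole-number percentage of the alignment length.

    Hypothetical helper: truncation (floor) is assumed because it matches
    the percentages printed in the summary above.
    """
    if alignment_length <= 0:
        raise ValueError("alignment length must be positive")
    return (100 * count) // alignment_length


ALIGNMENT_LENGTH = 528  # aligned columns in this segment, from the summary
identities = blast_percent(522, ALIGNMENT_LENGTH)
gaps = blast_percent(6, ALIGNMENT_LENGTH)
print(f"Identities = 522/{ALIGNMENT_LENGTH} ({identities}%)")
print(f"Gaps = 6/{ALIGNMENT_LENGTH} ({gaps}%)")
```

In this segment the six gap columns account for every non-identical position (522 + 6 = 528), so the sequences differ only by the gap and contain no base substitutions here.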



Query: 3997 gagacggtggaagagaaaattttgcacagggcaagtcagaaaaatacagttcaacagctt 4056 

I IIIMtllltlttIlllllilll)llllllNllMnillini)llltlllM(iH 
Sbjct: 3970 gagacggtggaagagaaaattttgcacagggcaagtcagaaaaatacagttcaacagctt 4029 

Query: 4057 gttatgactggagggcatgttcagggtgatgattttcttggagctgcggatgtggtatct 4116 

111 111)11 nni II tiiMiitiiniMiiii Ml 1)11 iiiiiHiin till 111 

Sbjct: 4030 gttatgactggagggcatgttcagggtgatgattttcttggagctgcggatgtggtatct 4089 

Query: 4117 ctgctaatggatgatgcggaggcagcacaactggagcagaaattcagagaactaccatta 4176 

IN II nil nilllllMl 111 II lit tllIMM II Mil IIIINMI 11)111)1} 
Sbjct: 4090 ctgctaatggatgatgcggaggcagcacaactggagcagaaattcagagaactaccatta 4149 

Query: 4177 caggtaaaggacaggcagaagaaaaagacgaaacgtatcagaatagatgctgaaggagat 4236 

MM )l)))MMMlllMMIIllMllllMlllllMIMIIIMtMM 
Sbjct: 4150 cagg------acaggcagaagaaaaagacgaaacgtatcagaatagatgctgaaggagat 4203 

Query: 4237 gcaactttggaagagttagaagatgttgaccgacaggataacggacaggaacctttggaa 4296 

MIMMMMMIMIMMMlMMMMltMMMMIIIMMMIMMMM 
Sbjct: 4204 gcaactttggaagagttagaagatgttgaccgacaggataacggacaggaacctttggaa 4263 

Query: 4297 gaaccggaaaagccaaaatccagtaataaaaagaggagagctgcttcaaatccgaaagct 4356 

1 1 1 1 1 11 1 1 M I M I M M 1 1 M M II II II II ) 1 1 1 M II I ))) 1 1 1 M 1 1 1 1 1 1 II 1 1 
Sbjct: 4264 gaaccggaaaagccaaaatccagtaataaaaagaggagagctgcttcaaatccgaaagct 4323 



Query: 4357 agagctcctcagaaagcaaaggaagaagcaaatggtgaagatactcctcagaggacaaaa 4416 
IIMIIIMMIIIIIIIIIIIIMMMIMMMMMMMMMIIMIMMIM 



4383 



4476 



Sbjct : 4324 agagctcctcagaaagcaaaggaagaagcaaatggtgaagatactcctcagaggacaaaa 

Query: 44XV ^^g^taaagagacaaacaaagagcata^ 
Sbjct: 4384 iiiiUiUUiciiiciiiiigiiiaaacgaaagtcttgaacctgtattctctgcctct 4443 

Sbjct: 4444 ^taacagaatcaaataaaggattcgatccaagtagctccgctaactaa 4491 

Figure 6: 

>AtRvb1 (At5g22330) 

>2564051 CDS from MWD9 (protein BAB08331) 




gattgctactcacacccatatcaaaggccttggcctcgagccaactggta 
tccctataaaattggcagctggatttgttggtcaacttgaggctagagag 




ggctcttttgcttgctggacctcctggaactgggaaaacagctttggctc 
ttggaatctctcaagagctgggaagcaaggttccattctgtccaatggtt 



gaattttagacgtgccattggtctacgtatcaaggaaaccaaagaagtct 

atgaaggggaggtcaccgagctgtcaccagaagaaactgaaagcctcact 

ggaggttatggtaaaagcatcagccatgttgtaattacactcaagacagt 

caaaggaaccaaacatctgaaattggatcccactatctatgatgccttga 

ttaaggaaaaggtagctgtaggagatgtaatctatatcgaagcaaacagt 



tctggaagcagaagaatatgttccacttcccaaaggagaggtccacaaaa 
agaaagagatagtgcaggatgtcacactccaagatctggatgcagcaaat 
gctcgacctcaaggtggccaggatatactttctttgatgggccaaatgat 



ica 



aggttgtgaaccgatatatagatgaaggtgtggcagagcttgttccagga 

gttctatttattgatgaggttcatatgcttgatatggagtgcttctcata 

cttgaaccgtgctctlgagagctcattatctccgatagtgatatttgcaa 



caaat 

ggagtccctattgatctattagatcgattggttatcatccggactcaaat 
ctatgatccctctgaaatgatccagattatagccattcgtgcgcaagttg 
aagaattaaccgtggatgaagaatgcttggttctacttggggagattggg 
caaagaacttcactaaggcacgctgtgcagcttctgtctcctgccagcat 






tgtagcgaaaatgaatggccgtgacaatatttgcaaggctgatatagagg 

aagtaacatcactctacttggatgctaaatcttcagcaaagcttttgcat 

gagcaacaagaaaaatacatctcatga 

Figure 7: 

>AtRvb21 (At5g67630) 

>At5g67630 and 3'UTR (prot BAB08471.1) 
atggcggaactaaagctatcagagagtcgggacttaaccagagtcgagcg 

aatcggcgcacactcacacatcagaggactaggtctcgactctgccctcg 

aa 



ggctattctaatagcgggtcaacccggaacgggtaagacagcgattgcaa 

tgggtatggcgaaatctcttggcttggaaactccttttgcgatgattgca 

ggaagtgaaattttctcattagagatgtcaaagacagaagctttgactca 



tgtgtatgatatgggagctaagatgattgaggctttgaacaaggagaaag 



aagcttggaagatcgttttcgaggtctcgtgattatgatgctatgggtgc 
gcagaccaagtttgtgcagtgccctgaaggtgagttgcagaagaggaaag 
aggttgtacattgtgtcactcttcacgagattgatgttatcaacagcagg 
acacaagggtttcttgcccttttcactggcgatactggagaaatccgatc 



gaaaagcagagatagttcccggagttctcttcattgatgaagtccacatg 

ctcgacatcgaatgcttctcattccttaaccgagctctagaaaacgaaat 

gtcaccaatccttgtggtggcaacaaaccgaggagtgacgacaatccgtg 

gcacaaaccagaaatcaccacacgggatcccgattgatctccttgaccgt 

cttctcatcatcactacccaaccttacacagacgatgacataaggaagat 

attagaaatccgttgccaagaggaagacgttgagatgaacgaagaggcca 

aacagcttttgacattgatcggacgtgatacatctctaaggtatgcgatt 

catcttataaccgcagctgcattgtcatgccagaaacggaaagggaaagt 

cgtggaggttgaggatattcagagagtttacagactgttcttggatgtga 

ggagatcgatgcagtatcttgttgagtatcagagtcagtatatgttcagt 





gaaccaatcaaaaacgatgaagctgctgcagaagacgaacaagatgctat 
gcagatctgaGGATCCACCTCTGTTTGCGTTATTTATCATGTTTCGTGGT 
GATATGTATGATTAGGATGTTGAACTCGGATTTATGTTTTTTTTTTTTTA 
AGTTGTGACGAGATTCGGTTCTAGAAAATGATTTAACCAAGTTCAATACA 
GATCGGTTTGGTACAAAACAAAAAAAAAAAAAAA 

Figure 8: 

>AtRvb22 (At3g49830) 

>At3g49830 prediction (protein CAB66921.1) 

atggcagaactaaggttatcagaaactcgagacttaactaggatcgaaag 

aatcggagcacactcacacatacgaggtttaggtctcgactcagtactcg 

agccacgagccgtatccgaaggaatggttggtcaaatcaaagcacgtaaa 

gccgccggagtaaccctcgagttgatcagagacggcaaaatctcgggtcg 

ggctatacttatagcgggtcaacccggaacgggtaaaatcgcaatagcaa 

tgggtdtagcaaaatcacttggacaagaaacaccattcactatgattgca 

ggaagtgagatcttttctttagagatgtcaaagactgaagctttaactca 

agcttttcgtaaagctattggtgttaggatcaaagaagagactgacgtga 

tagaaggagaagttgtgacgatttcgattgatagacctgcttcttctggt 

ggttctgtgaagaagactgggaagataacaatgaagacgactgatatgga 

atctaattttgatttgggatggaaattgattgagccattggataaggaga 

aagtacagagtggtgatgttattgttttggataggttttgtgggaagatt 

actaagcttggaagatcttttacgaggtctagagattttgatgttatggg 

ttcaaagactaagtttgtgcagtgccctgaaggtgagcttgagaagagga 

aggaggttttgcattctgtcacacttcatgagattgatgttattaatagc 

aggactcaagggtatctagccctcttcacaggtgatacaggcgagattcg 

ttcagaaacccgagagcaaagcgatactaaagtggcagagtggagagaag 

aagggaaagctgaaatagtacctggtgttctcttcattgatgaagtccat 

atgcttgatatcgaatgcttctctttcctgaatagagctctcgaaaacga 

tatgtcaccaatcctggtcgtggctacaaacagaggaatgacaacaatcc 

gaggaacaaaccagatatcagcacatgggatcccaatcgattttcttgac 

cgtcttcttattatcacaacacagccttacacacaagacgagatcagaaa 

tattttagagatccgttgccaagaagaggatgtggagatgaacgaggaag 

cgaaacagcttctgactttgatcggatgtaatacctcgcttaggtacgcg 

attcatctaatcaatgcagctgccctagcttgcctgaaacgtaaagggaa 



ccaagagatcgatgcagtacttggttgagcatgagagcgagtacttgttt 
agcgtgcctataaaaaacacacaggaggctactgcaggagaagaaai 

acacgaggccatggaagtttga 
Figure 9: At3g57290 

>eIF3e Ath mRNA AF285832 (protein AAG53613.1) 



taatagcgcdaacctggacagacacttggtgtttcctatattcgagttc 

ctlcaagagcgtcagctttaccctgatgagcagatcctgaagtctaaaat 

ccagcttttgaaccagacgaacatggttgattacgccatggatattcaca 

agagtctctaccacactgaagacgctcctcaagaaatggtggagagaaga 

acagaggttgtcgctaggctcaaatctttggaggaggctgctgcaccact 

cgtgtcttttcttttgaaccctaacgctgtgcaggagctaagagctgaca 

agcagtacaatctccaaatgctcaaggaacgctaccagattggtccagac 

cagattgaggctt^taccagtacgccaagtttcagtttgaatgtggcaa 

ctattctggtgctgctgattatcttlaccagtacaggaccctgtgctcta 



ttgatgcaaaactgggatattgctcttgaagagcttaaccgtctcaaaga 

gattattgactcaaagagtttttcatcgccgttaaaccaggtgcagaaca 

ggatttggttgatgcattggggtctgtatatcttttttaaccatgataat 

ggaaggacacagatcattgatctttttaaccaagacaagtatctgaatgc 

catccaaactagtgctccacacttgctgcgctacttggcaactgctttca 



cagcaagagcaclactcctacaaagatccaattatcgagttcctggcatg 

tgtgtttgtcaattatgactttgatggggctcaaaagaagatgaaagagt 

gtgaagaggtcattgtgaatgatccattccttggcaagcgagttgaggat 

ggaaacttttcaactgtaccactgagagatgaatttcttgaaaatgcccg 

cctattcgtctttgaaacctattgcaaaattcatcaaaggattgacatgg 

gggtacttgctgaaaaattgaatclgaactatgaggaggccgagagatgg 

attgtgaacctaatccgcacctcaaagcttgatgccaagattgattctga 

gtcaggaactgtaatcatggagcctactcagcccaacgtgcatgagcagt 






cagctcttggaacacacacaggcgcaagcaactcgctagtcaaaattttg 

ctgtggaagcctttccttgataaaactcaeGttcggttgactggaattat 

tttctttttcttgctctgagttcaccltttactttgaaaaagattattat 

ggagttgttctatttgaaatgttggatccacagattggaacattttccaa 

ccaaatcagcattttgtaaaaaaaaaaaaaaaaaaaaaaaa 
Figure 10: plant homologs of Hw17 
***:***:*:*:****:*.**• * *: • .*:**:*:::*♦:**.***:********* 

LVHLALQPKASPLAAITILKKICDHPLLLTKKGAEGVLEGMGEMLNDQDIGM>^KI^^ 
LVHSSMQ--GSPLAAITILKKICDHPLLLTKKAAEGVLEGMDAMLNNQEMGMVEKMAMNL 
IVLSAFD--GSPLAALTILKKICDHPLLLTKRAAEDVLEGMDSTLTQEEAGVAERLAMHI 
... ^*****:***************: .**.*****. *.::: *:-*::**:: 

ADMAHDDN--ALEVGQDVSCKLSFIMSLLRNLVGEGHHVLIFSQTRKMLNLIQEAIILEG 
ADMAHDDDDVELQVGQDVSCKLSFMMSLLQNLVSEGHNVLIFSQTRKMLNIIQEAIILEG 
ADNVDTDD--FQTKNDSISCKLSFIMSLLENLIPEGHRVLIFSQTRKMLNLIQDSLTSNG 
V; ,.,:****♦*:****.**: ***.************:**::: :* 

YAFLRIDGTTKVSDRERIVKDFQEGCGAPVFLLTTQVGGLGLTLTKATRVIVVDPAWNPS 
YKFLRIDGTTKISERERIVKDFQEGPGAPIFLLTTQVGGLGLTLTKAARVIVVDPAWNPS 
YSFLRIDGTTKAPDRLKTVEEFQEGHVAPIFLLTSQVGGLGLTLTKADRVIVVDPAWNPS 
********** .:* : *::**** **:****:************ ************ 

TDNQSVDRAYRIGQTKNVIVYRLMTSATIEEKIYKLQVLKGALFRTATEQKEQTRYFSKS 
TDNQSVDRAYRIGQMKDVIVYRLMTSGTIEEKIYKLQVFKGALFRTATEHKEQTRYFSKR 
TDNQSVDRAYRIGQTKDVIVYRLMTSATVEEKIYRKQVYKGGLFKTATEHKEQIRYFSQQ 
************** *:*********.*:*****: ** **.**:****:*** 

EIQELFSLPQQGFDVSLTHKQLQEEHGQQVVLDESLRKHIQFLEQQGIAGVSHHSLLFSK 
DIQELFSLPEQGFDVSLTQKQLQEEHGHQLVMDDSLRKHIQFLEQQGIAGVSHHSLLFSK 
DLRELFSLPKGGFDVSPTQQQLYEEHYNQIKLDEKLESHVKFLETLGIAGVSHHSLLFSK 
...******: ***** *::** *** :*:.*.•*::*** ************** 

TATLPTLSENDALDSKPRGMPMMPQQYYKGSSSDYVANGASFALKPKDESFTVRN-YIPS 
TAILPTLNDNDGLDSR-RAMPMA-KHYYKGASSDYVANGAAYAMKPKE--FIART-YSPN 
TAPIQAIQKDEEEQIR-RETALLLGRASASISQDTVINGADYAFKPKDVNLDKRINISPV 
** : : * : . *.* * *** :*:***: : * * 

NRSAESPEEIKARINRLSQTLSNAVLLSKLPDGGEKIRRQINELDEKLTSAEKG----LK 

STSTESPEEIKAKINRLSQTLANTVLVAKLPDRGDKIRRQINELDEKLTVIESSPEPLER 

DDKELSESVIKARLNRLTMLLQNKGTVSRLPDGGAKIQKQIAELTRELKDMKAA----ER 

* . ***::***: * * :::*** * **::** ** -5*- • • s 